From mark at alchemedia.jp Fri Feb 3 01:29:41 2006
From: mark at alchemedia.jp (Mark Korver)
Date: Fri Feb 3 01:22:34 2006
Subject: [tiling] Example Zero
Message-ID: <43E2F855.5060501@alchemedia.jp>

We've done some work in the past looking at tiling/caching schemes for
both WFS and WMS requests, and also for jp2K and DEM files that we need
to hold in a "pyramid" for JetStream-3D, our 3D viewer applet.

Clearly, if the client can get data directly from the file system, you
have much better scalability. This allows a one-to-many relationship
between the master WMS and slaved forward machines that are just
running HTTP behind a load balancer. It is also easier to set up
multiple locations for failover or a globally load-balanced setup. You
just need a mechanism to populate the HTTP boxes when data is changed
or created on the WMS.

For large data sets, you need some kind of XY file naming scheme,
including a directory scheme that can handle extremely large
collections of files. Just putting it in the cache is not enough:
directories with too many files are slow. Of course, if the map is
small, you don't need any of this; as mentioned earlier, Squid will do.

For our 3D viewer applet that uses jpeg2K data we use a 1024x1024 pixel
file. This has to do with the 3D engine. For aerial data at 25 cm per
pixel, that comes to about 250 meters a side. Imagine how many files
you would need to cover a large country, let alone the whole world as
Google Earth does.

One of our Jpeg2K files is named like this:

00x132136000y34017600.jp2

The 00 is used for metered coordinate systems and is unused in this
file. The rest of the name is the decimal degrees value for the upper
left corner of the image: 132.136000 34.017600.

Because our goal was a dir/naming scheme that would keep the max number
of files in any one directory at a reasonable level, we came up with
this:

jp2/chuden/1600/pp10/33/24/10/00x132136000y34017600.jp2
dem/chuden/1600/pp10/33/24/10/00x132136000y34017600.bdm

filetype/project_name/scale/ppX[1]Y[1]/X[2]Y[2]/X[3]Y[3]/X[4]Y[4]/00xX[1]X[2]X[3]X[4]XXXXX[4]yY[1]Y[2]Y[3]Y[4]YYYY.ext

Basically we just use the file name to create the directory structure.
As it turns out, if you subdivide four levels from the left of the DD
value you maintain a reasonable file count. Sorry, I can't remember
what the actual number was.

Because these Jpeg2K files internally have 4 layers, the next level
down in resolution is a tile that is 8X the size of the 1600 level, so
12800. This is different from simple raster formats like png, which
should have something closer to 2X in difference. The path and file
name for the next level would look like this:

jp2/chuden/12800/pp10/33/24/10/00x132134400y34022400.jp2

This is how it works out in terms of tile width and pix/units:

LAYER   ; TILE WIDTH DD       ; IDEAL PIXEL/UNIT RATIO
layer 1 ; 0.0016 (1600 Dir)   ; 0.0000015625 (about 0.142 meters)
layer 2 ; 0.0128 (12800 Dir)  ; 0.0000125 (about 1.136 meters)
layer 3 ; 0.1024 (102400 Dir) ; 0.0001 (about 9.1 meters)
layer 4 ; 0.8192 (819200 Dir) ; 0.0008 (72.72 meters) (SRTM level)
and so on..

One thing you may notice here is that jp2 and dem files are separated
at the top level (above the project name). This is because their file
sizes are an order of magnitude different and require different file
systems and HD sector sizes. This is important for access speed (one
sector read instead of two) and also for efficiency when using
Terabyte class storage. We found that ReiserFS can be many times faster
than ext2 when handling files smaller than 1 KB in size.

FYI, when generating these jpeg2K files we use web services based grid
clients to query a small cluster of Mapservers over the company LAN.
The "job" machine receives the newly jp2K encoded files from the
clients and sends them on to the off-site (data center) "streaming"
server's web service "content manager".

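The naming scheme above can be sketched in Python as follows. This is a reading inferred from the two example paths, not the actual content manager code. The function name `tile_path`, the storage of coordinates as integer micro-degrees, the zero-padding of each value to nine digits before taking the directory digits, and the treatment of `pp` and the leading `00` as fixed literals are all assumptions that happen to reproduce the examples. Negative coordinates are not covered.

```python
def tile_path(filetype, project, scale, lon, lat, ext):
    """Build a tile path from the upper-left corner in decimal degrees.

    Assumptions (inferred from the example paths, not confirmed):
    - coordinates are written as integer micro-degrees;
    - directory digits come from each value zero-padded to 9 digits;
    - "pp" and the leading "00" are treated as fixed literals;
    - only non-negative coordinates are supported.
    """
    x = round(lon * 1_000_000)
    y = round(lat * 1_000_000)
    if x < 0 or y < 0:
        raise ValueError("negative coordinates are not covered by this sketch")
    xd = f"{x:09d}"
    yd = f"{y:09d}"
    # Pair the first four digits of X and Y: X[1]Y[1], X[2]Y[2], ...
    pairs = [xd[i] + yd[i] for i in range(4)]
    dirs = ["pp" + pairs[0]] + pairs[1:]
    # The file name itself uses the unpadded values, as in the examples.
    name = f"00x{x}y{y}.{ext}"
    return "/".join([filetype, project, str(scale)] + dirs + [name])


print(tile_path("jp2", "chuden", 1600, 132.136000, 34.017600, "jp2"))
```

With the coordinates from the example file name, the sketch yields the same four directory levels (pp10/33/24/10), so no directory ever holds more than the tiles sharing those leading digits.
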
The content manager, using the project name, scale value etc., writes
files to the appropriate directory.

Hope this helps.

- Mark

On 1/26/06, Adam Hill wrote:
> To get the ball rolling - Here is a PDF that describes how NASA
> WorldWind / Punt does tiling currently.
>
> http://www.ceteranet.com/nww-tile-struct.pdf
>
> In the configuration data on the client side for a given Layer we have
> a NumberOfLevels, LevelZeroTileSizeDegrees and TileSize. It's just a
> simple powers of 2 system.
>
> This is currently being served for the WW community by:
> 1) NASA using just a filesystem backend for almost all of WW's
> datasets, including elevation.
> 2) Terraserver, which pulls the tiles from a SQL database
> 3) OnEarth with a WMS + a smart CGI script + a cache for WW, written
> by Lucian Plesea.
> 4) The Free Earth Foundation server for the Zoomit Dataset (Robbin
> Island, Mass.gov hires and some other hi-res but small datasets) and
> others using a packed tile scheme (pyramids) and some simple PHP.
>
> All of the above respond to
> "/tileset?L=&X=&Y=" HTTP
> GET's.
>
> The mysteries to me are - picking optimal LZTSD (can one size really
> fit all?) and outside of how much bandwidth it consumes is a tile size
> of 512 really so much better/worse than 128 or even 1024?
> _______________________________________________
> tiling mailing list
> tiling at lists.eogeo.org
> http://lists.eogeo.org/mailman/listinfo/tiling
>

From steven_morris at ncsu.edu Tue Feb 7 14:01:48 2006
From: steven_morris at ncsu.edu (Steve Morris)
Date: Tue Feb 7 14:01:58 2006
Subject: [tiling] Example Zero
In-Reply-To:
References:
Message-ID: <43E8EE9C.1070808@ncsu.edu>

I don't have a solution to offer for the standardized tiling problem
but would like to throw another use case into the mix.
We are involved in a project focused on long-term preservation of
digital geospatial data, and one of the approaches we are considering
is automated capture of tiled images from WMS services, including
rasterized vector data. Obviously we are most focused on getting and
keeping the underlying data, but that is not always possible for a
variety of reasons having to do with scalability and the need to find
automated processes for archive development (and WFS not being
widespread), etc. We are also looking at "desiccated data" options,
like WMS captures, as a fail-safe alternative to the underlying vector
data--which we may actually fail to preserve over the longer term.

The tie-ins to the tiling discussion are: 1) if we do try to automate
WMS captures it would be best if we could try to align the captures
with a widely accepted tiling scheme, 2) if WMS caches are already
being created according to a standard tiling scheme then we might want
to just go ahead and grab those as part of the archive development
process, and 3) we are interested in how the temporal dimension will
play out in the tiling discussion, i.e. what is the relationship
between WMC documents having a temporal dimension and superseded tiles?

Steve

Adam Hill wrote:

>To get the ball rolling - Here is a PDF that describes how NASA
>WorldWind / Punt does tiling currently.
>
>http://www.ceteranet.com/nww-tile-struct.pdf
>
>In the configuration data on the client side for a given Layer we have
>a NumberOfLevels, LevelZeroTileSizeDegrees and TileSize. It's just a
>simple powers of 2 system.
>
>This is currently being served for the WW community by:
>1) NASA using just a filesystem backend for almost all of WW's
>datasets, including elevation.
>2) Terraserver, which pulls the tiles from a SQL database
>3) OnEarth with a WMS + a smart CGI script + a cache for WW, written
>by Lucian Plesea.
>4) The Free Earth Foundation server for the Zoomit Dataset (Robbin
>Island, Mass.gov hires and some other hi-res but small datasets) and
>others using a packed tile scheme (pyramids) and some simple PHP.
>
>All of the above respond to
>"/tileset?L=&X=&Y=" HTTP
>GET's.
>
>The mysteries to me are - picking optimal LZTSD (can one size really
>fit all?) and outside of how much bandwidth it consumes is a tile size
>of 512 really so much better/worse than 128 or even 1024?
>

--
Steve Morris
Head of Digital Library Initiatives
North Carolina State University Libraries
Phone: (919) 515-1361 Fax: (919) 515-3031
Steven_Morris@ncsu.edu

From adoyle at eogeo.org Mon Feb 13 13:39:26 2006
From: adoyle at eogeo.org (Allan Doyle)
Date: Mon Feb 13 13:39:29 2006
Subject: [tiling] Example Zero
In-Reply-To: <43E8EE9C.1070808@ncsu.edu>
References: <43E8EE9C.1070808@ncsu.edu>
Message-ID: <67AD9170-0E3E-41F1-A7EB-5C355BD7DCB0@eogeo.org>

I think use cases are important and it's good to see a bit of variety
in what people think they could do.

Allan

On Feb 7, 2006, at 14:01, Steve Morris wrote:

>
> I don't have a solution to offer for the standardized tiling
> problem but would like to throw another use case into the mix. We
> are involved in a project focused on long-term preservation of
> digital geospatial data, and one of the approaches we are
> considering is automated capture of tiled images from WMS services,
> including rasterized vector data. Obviously we are most focused on
> getting and keeping the underlying data, but that is not always
> possible for a variety of reasons having to do with scalability and
> the need to find automated processes for archive development (and
> WFS not being widespread), etc. We are also looking at "desiccated
> data"
> options, like WMS captures, as a fail-safe alternative to the
> underlying vector data--which we may actually fail to preserve over
> the longer term.
>
> The tie-ins to the tiling discussion are: 1) if we do try to
> automate WMS captures it would be best if we could try to align the
> captures with a widely accepted tiling scheme, 2) if WMS caches are
> already being created according to a standard tiling scheme then we
> might want to just go ahead and grab those as part of the archive
> development process, and 3) we are interested in how the temporal
> dimension will play out in the tiling discussion, i.e. what is the
> relationship between WMC documents having a temporal dimension and
> superseded tiles?
>
> Steve
>
> Adam Hill wrote:
>
>> To get the ball rolling - Here is a PDF that describes how NASA
>> WorldWind / Punt does tiling currently.
>>
>> http://www.ceteranet.com/nww-tile-struct.pdf
>>
>> In the configuration data on the client side for a given Layer we
>> have a NumberOfLevels, LevelZeroTileSizeDegrees and TileSize. It's
>> just a simple powers of 2 system.
>>
>> This is currently being served for the WW community by:
>> 1) NASA using just a filesystem backend for almost all of WW's
>> datasets, including elevation.
>> 2) Terraserver, which pulls the tiles from a SQL database
>> 3) OnEarth with a WMS + a smart CGI script + a cache for WW, written
>> by Lucian Plesea.
>> 4) The Free Earth Foundation server for the Zoomit Dataset (Robbin
>> Island, Mass.gov hires and some other hi-res but small datasets) and
>> others using a packed tile scheme (pyramids) and some simple PHP.
>>
>> All of the above respond to "/tileset?L=&X=&Y=" HTTP
>> GET's.
>>
>> The mysteries to me are - picking optimal LZTSD (can one size really
>> fit all?) and outside of how much bandwidth it consumes is a tile
>> size of 512 really so much better/worse than 128 or even 1024?
>
> --
> Steve Morris
> Head of Digital Library Initiatives
> North Carolina State University Libraries
> Phone: (919) 515-1361 Fax: (919) 515-3031
> Steven_Morris@ncsu.edu
>

--
Allan Doyle
+1.781.433.2695
adoyle@eogeo.org

From bob.basques at ci.stpaul.mn.us Wed Feb 15 09:18:15 2006
From: bob.basques at ci.stpaul.mn.us (Blammo)
Date: Wed Feb 15 09:18:09 2006
Subject: [tiling] New to list.
Message-ID: <43F33827.6060506@ci.stpaul.mn.us>

All,

I'm doing a lot of the stuff described in here already. At least the
stuff I've read from the archive already.

Look for more posts as I work my way through the archives.

bobb

From bob.basques at ci.stpaul.mn.us Wed Feb 15 10:36:31 2006
From: bob.basques at ci.stpaul.mn.us (Blammo)
Date: Wed Feb 15 10:36:31 2006
Subject: [tiling] Ok, got through the Archives . . .
Message-ID: <43F34A7F.8090300@ci.stpaul.mn.us>

All,

I got through the archives . . . I thought there were more of them.

I still need to ponder stuff some, but from my experience. . . .

* The Temporal aspect is a very important piece. Even WMC doesn't
  address all concerns.
* I'm not sure a common tiling scheme approach is a definitive way to
  accomplish a cascading mechanism. First, it would take time to be
  adopted, and also, how the heck would one promote the idea of
  implementing something like this on old databases?
* I've always thought something like this would rely on an Automated
  client (Mapserver in Client mode for example) as a means of pushing
  the data to another service.
* The question of Updates and of when and where they happen is also a
  big piece. What's the best way for a service to announce that a
  change has taken place?
  Is there even a way to handle this type of change log being pushed
  out?

More thoughts are stewing, those were just the easy ones to
articulate. :c)

bobb

From adoyle at eogeo.org Wed Feb 15 10:58:43 2006
From: adoyle at eogeo.org (Allan Doyle)
Date: Wed Feb 15 10:58:42 2006
Subject: [tiling] Ok, got through the Archives . . .
In-Reply-To: <43F34A7F.8090300@ci.stpaul.mn.us>
References: <43F34A7F.8090300@ci.stpaul.mn.us>
Message-ID: <8C561A4A-5D24-45AA-A637-4C2F4415EF9E@eogeo.org>

On Feb 15, 2006, at 10:36, Blammo wrote:

> All,
>
> I got through the archives . . . I thought there were more of them.
>
> I still need to ponder stuff some, but from my experience. . . .
> The Temporal aspect is a very important piece. Even WMC doesn't
> address all concerns.

Think of this as an 80/20 problem. Temporal may be in the 20% we are
not solving. The real key is that if we can develop a tiling scheme for
the 80% most-used layers, we can blow the proverbial doors off of the
notion that WMS is slow. The "long-tail" layers, including temporal
ones, can be requested by clients as well, but they don't need to come
from a cache. They can come from standard WMS servers.

> I'm not sure a common tiling scheme approach is a definitive way to
> accomplish a cascading mechanism. First, it would take time to be
> adopted, and also, how the heck would one promote the idea of
> implementing something like this on old databases?

This is not strictly to do cascading. Think of this more as a means of
pre-rendering the most commonly needed tiles.

> I've always thought something like this would rely on an Automated
> client (Mapserver in Client mode for example) as a means of pushing
> the data to another service.

Sort of. If we develop a tiling spec, then the assumption is that it's
up to a client to make requests for the tiles.
Imagine a client that only ever issues getmap requests for a very
constrained set of bbox boundaries and width/height combos. The client
is free to do things like Google Maps, i.e. cache tiles locally,
prefetch, etc. Or it can do things like Google Earth, osgPlanet, etc.
and wrap/warp the tiles around a virtual globe. The key is that by
constraining itself to a limited set (e.g. Google Maps has a set of
about 17 levels of zoom with three possible renderings - map,
satellite, and combined) the client can benefit from tiles that may
have been rendered for other clients previously.

> The question of Updates and of when and where they happen is also a
> big piece. What's the best way for a service to announce that a
> change has taken place? Is there even a way to handle this type of
> change log being pushed out?

Updates can either happen silently at the server side, with the client
always going to the ur-server rather than to other caches (this is the
Google maps/earth model), or there can be some way to use the built-in
cache coherency mechanisms of the various transports. There can be
time-to-live values associated with tiles, for instance.

> More thoughts are stewing, those were just the easy ones to
> articulate. :c)
>
> bobb

--
Allan Doyle
+1.781.433.2695
adoyle@eogeo.org
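
The constrained set of bbox boundaries described in the message above can be sketched as a fixed grid in which the tile size halves at each level, in the spirit of the powers-of-2 system quoted earlier in the thread. This is only an illustration: the function names, the grid origin at (-180, -90), and the 36-degree level-zero tile size are assumptions chosen for the example, not values taken from any of the servers mentioned. Points exactly on the east or north edge of the world are not handled.

```python
import math


def tile_index(lon, lat, level, level_zero_deg=36.0):
    """Return the (x, y) index of the tile containing a point.

    Assumes a grid anchored at lon=-180, lat=-90 whose tile size
    halves with each level (hypothetical parameters).
    """
    size = level_zero_deg / (2 ** level)
    x = math.floor((lon + 180.0) / size)
    y = math.floor((lat + 90.0) / size)
    return x, y


def tile_bbox(x, y, level, level_zero_deg=36.0):
    """Return (minx, miny, maxx, maxy) in degrees for a tile index."""
    size = level_zero_deg / (2 ** level)
    minx = -180.0 + x * size
    miny = -90.0 + y * size
    return (minx, miny, minx + size, miny + size)


# Snap an arbitrary point to the one bbox a client would request.
x, y = tile_index(132.136, 34.0176, level=1)
print(x, y, tile_bbox(x, y, level=1))
```

Any two clients interested in the same point at the same level compute the same bbox, so a tile rendered for one of them can be served from a cache to the other.
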