-
Most users whose data I've interacted with have used shapefiles for vectors, and pretty much all vector data that I've downloaded from anywhere is also in shapefile format. I only use shapefiles, partly because I'm lazy, partly because ArcGIS still doesn't handle geopackages very well. There's a lot of inertia around shapefiles, not that this is necessarily a compelling reason to keep supporting them (because they do have a lot of annoying quirks that geopackages do not), but I suspect that we'd get pushback from users if we stop. Is there a way to do a poll on the user forum? It would be interesting to hear users' take on it.
-
My only "con" about GPKG is that when you open them in QGIS, because of the way things are handled behind the scenes, a write occurs which updates the file timestamp. This is wicked annoying when checking results for large data when using Taskgraph.
Definitely doable, but when Dave ran a poll via the forums in the past I don't think we saw many responses...
-
Adding @cybersea to this conversation!
-
I'm all for moving away from shapefiles, but my inclination would be to phase them out by not using them in new models. Or replacing them during large refactors. Mainly for the reasons Stacie mentioned and for the possibility of introducing bugs in models that we don't otherwise touch very often.
-
I prefer geopackage because it is a single file, can use standard SQL queries, can store multiple files, doesn't have a field length constraint (and probably some more things I'm not thinking of at the moment). I think it would be ideal to push towards this format. But I agree with @newtpatrol that there is still a LOT of usage and inertia around shapefiles, and I don't see that changing anytime soon, despite many people's efforts to do so. Is it possible to give the user the option of shapefile or geopackage output? With geopackage as the default?
-
If you do use geopackage -- would it be possible to create a single geopackage with multiple layer outputs from the models? Since that's a nice benefit of that format.
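For anyone curious how multiple layers sit inside one file: a GeoPackage is a SQLite database, and each layer is registered as a row in its `gpkg_contents` table. Here is a minimal sketch of listing the vector layers with plain SQL from Python's standard library. The in-memory database and the layer names (`watersheds`, `streams`, `dem`) are made-up stand-ins so the snippet runs on its own; `list_feature_layers` is a hypothetical helper, not something from InVEST or GDAL, and the stand-in table carries only a few of the real columns.

```python
import sqlite3


def list_feature_layers(connection):
    """Return the names of all vector (feature) layers in a GeoPackage."""
    rows = connection.execute(
        "SELECT table_name FROM gpkg_contents "
        "WHERE data_type = 'features' ORDER BY table_name")
    return [row[0] for row in rows]


# Stand-in for a real .gpkg file (assumption: hypothetical layer names).
conn = sqlite3.connect(':memory:')
conn.execute(
    "CREATE TABLE gpkg_contents ("
    "table_name TEXT PRIMARY KEY, data_type TEXT NOT NULL, identifier TEXT)")
conn.executemany(
    "INSERT INTO gpkg_contents VALUES (?, ?, ?)",
    [('watersheds', 'features', 'watersheds'),
     ('streams', 'features', 'streams'),
     ('dem', 'tiles', 'dem')])  # raster tiles are not vector layers

print(list_feature_layers(conn))
```

With a real output file you would pass `sqlite3.connect('path/to/file.gpkg')` instead of the in-memory stand-in.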
-
I also have issues with In theory, I like the idea of polling the community, but I do not imagine we'd garner many responses if it's just on the forum for people to stumble upon. Do we have the email addresses of all forum participants and, if so, could we email everyone a poll? I think we'd get way more responses that way.
-
That's an idea I haven't thought of before.
-
Do people have a way of opting in to receiving emails? If not, I'd be careful about sending direct emails to people. Personally, I immediately get off of the mailing list for every entity that blasts me without my having explicitly opted in. Here's another totally stream-of-consciousness thought. Along with the forum, comms sends out a regular NatCap newsletter. I wonder if it would be effective to create a Google (or whatever) poll, and advertise it through both the forum and newsletter. Perhaps this isn't important enough to warrant that level of effort though.
-
We can see the emails, but there is no mass email feature and I'm pretty sure that is by design. Many people interact with the forum via email, but that's because they opted in to receive emails triggered by posts. Many more people probably signed up for the forum and never went back to visit it; I wouldn't want to email those people. Especially because the email would have to be sent from one of us, not from "The NatCap Forum". I'm not sure the newsletter gets to the right audience (I don't really know its audience). A forum poll never hurts, even if there are very few responses. A choice of output file format might be a nice way to ease the transition for people attached to one format or the other. It would be a heavy-ish lift for development though. And I'm not sure we have much to lose by just picking the format we think is best and imposing our will (with whatever transition plan we think makes sense).
-
Another reason to stop using shapefiles: Logs filled with warnings like this:
See also #1023
-
I always wondered what was causing those warnings, but have mostly ignored them. Why exactly do they occur? If it's related to field data that InVEST is creating, I'd think that InVEST would add a field of appropriate size so there's no problem. If it's related to field data that the user already had in the shapefile, I still don't understand where the warning comes from, since obviously the value fit just fine in the user-supplied shapefile, else it wouldn't be there to write to the output. Does InVEST somehow create a new field of a different size where the value no longer fits? If so, why wouldn't it create a field of the correct size?
-
Great questions. It's related to fields that are already in the input data that InVEST doesn't know about but nevertheless wants to preserve. For example, if InVEST just reprojects an input, it creates a new vector and preserves all the fields. And there are other cases where we copy an input to an intermediate/output file before modifying it. What I don't understand is why, when the input data seems to have a well-defined width & precision for a field, GDAL does not use that metadata when it creates the same field for the new vector. https://trac.osgeo.org/gdal/ticket/6803 This issue suggests the warning would only occur "when there's no explicit width/precision in the source dataset." That would make sense... it's too costly to infer the correct size from the data itself. But it seems to also occur when there is an explicit definition in the source 🤷
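As a rough illustration of how a value can "not fit": the DBF table behind a shapefile stores each number as fixed-width text, with a declared total width and number of decimals. A minimal sketch of that constraint follows. This is not GDAL's actual logic, and `fits_dbf_numeric` is a hypothetical helper made up for this example.

```python
def fits_dbf_numeric(value, width, precision):
    """Check whether a number fits in a fixed-width DBF-style numeric field.

    The value is rendered with `precision` decimal places; the sign, the
    digits and the decimal point must all fit within `width` characters.
    """
    text = f"{value:.{precision}f}"
    return len(text) <= width


# '1234.568' is exactly 8 characters, so it fits a width-8 field.
print(fits_dbf_numeric(1234.5678, width=8, precision=3))
# '123456.789' needs 10 characters, so it does not fit.
print(fits_dbf_numeric(123456.789, width=8, precision=3))
```

If the field created in the new vector ends up narrower than the one in the source, a value that was fine in the input can fail this kind of check on output.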
-
See also @phargogh's comment explaining these shapefile field warnings here: #1023
-
I've run into trouble when trying to add new fields to GPKGs. This is much easier to do (possible) with SHPs.
-
I've stumbled upon another reason NOT to use geopackages. Width and precision for numeric fields are not part of the GeoPackage standard. This is a major limitation. If someone can show me how to add new fields to an existing geopackage and/or how to set numeric scale and precision, please LMK!
-
@jagoldstein I can't speak to adding fields in ArcGIS, but adding a field to a GPKG is super easy in QGIS:
-
Thanks @phargogh. It's good to hear that this is easy in QGIS. It's definitely ArcPro that's giving me trouble when I try to edit gpkgs (including adding fields). There might be a learning curve that I haven't gotten past yet, but generally, the ESRI software seems more compatible with shps, which isn't surprising.
-
@jagoldstein you're ahead of me, since I can't figure out how to add a field at all to a geopackage output by InVEST in ArcPro; it calls the file "read only". And the .gpkg output by UNA can't even be viewed in ArcDesktop; it doesn't like the dash in the layer name Commune-popgroup. (I should have found that during beta testing!) While we might get away with ignoring ArcDesktop since it's on its way out, we need to make sure that either things work in ArcPro, or we at least have a workaround for it to recommend. I might spend some time on this "read only" thing, although the workaround might just be exporting to shapefile.
-
My takeaway from this discussion so far is that the Shapefile format has a lot of momentum and history, and thus is what most folks are comfortable using. Shapefiles certainly have their issues as pointed out above, but GPKG might be a burden for end users as it's still an up-and-coming format and not well supported by ArcGIS. This makes me think we don't necessarily want to phase out Shapefiles. I'd rather we take up the burden of handling multiple vector formats than push that burden to the user (both internal and external users). And I think having models output different vector formats is confusing to the user, so my inclination is to focus on consistency in what InVEST chooses to implement. I propose:
-
I've heard some good arguments against shapefiles (http://switchfromshapefile.org/). I haven't heard any arguments for shapefiles (but if there are any shapefile defenders out there, please speak up!) @newtpatrol @jagoldstein do you have a preference? Have you come across any GIS tools that only work with shapefiles?
benefits
drawbacks
tasks
gdal.GetDriverByName('ESRI Shapefile') and .shp
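If the hard-coded driver name and `.shp` extension were replaced with a user-selectable format, one possible shape for it is sketched below. This is only an illustration under assumptions: `VECTOR_FORMATS` and `vector_output_path` are hypothetical names, not existing InVEST code. The keys are the GDAL driver short names for the two formats, and GeoPackage is the default, as suggested in the discussion.

```python
import os

# GDAL driver short names mapped to their conventional file extensions.
VECTOR_FORMATS = {
    'GPKG': '.gpkg',
    'ESRI Shapefile': '.shp',
}


def vector_output_path(workspace_dir, basename, driver_name='GPKG'):
    """Build an output vector path whose extension matches the chosen driver."""
    try:
        extension = VECTOR_FORMATS[driver_name]
    except KeyError:
        raise ValueError(
            f"Unsupported vector format {driver_name!r}; "
            f"expected one of {sorted(VECTOR_FORMATS)}")
    return os.path.join(workspace_dir, basename + extension)


print(vector_output_path('workspace', 'watersheds'))
print(vector_output_path('workspace', 'watersheds', 'ESRI Shapefile'))
```

The same `driver_name` string could then be handed to `gdal.GetDriverByName(driver_name)`, so the format is chosen in one place rather than at every call site.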