Use w3c xml.xsd from Darwin Core repository #134
This would affect docs/text/index.md and https://dwc.tdwg.org/text/tdwg_dwc_text.xsd. I don't see a copy of xml.xsd on rs.tdwg.org. We'd need that to resolve this issue.
See also issue #124
If this file is critical to the functioning of Darwin Core, then it should be considered part of the standard itself. If that is true, then I believe the appropriate course of action would be to treat it in the same manner as other standards documents: assign it an IRI in the rs.tdwg.org subdomain following the IRI pattern scheme for documents. Using patterns analogous to those of the TAPIR XML schema and ABCD XML schemas listed in the file that defines document redirects, I would recommend:

… with a …

The behavior of dereferencing such a "permanent" IRI can be seen by dereferencing the TAPIR schema "permanent" IRI http://rs.tdwg.org/tapir/doc/xmlschema/ using cURL with …

The first response is the appropriate Linked Data behavior -- if an RDF … Although this extra HTTP GET is a bit kludgy, it seems to work fine and in the end produces the correct document with the correct …

I think that this general approach (using a standard IRI pattern and redirecting from a permanent IRI to a redirect URL) is the right one and what we should do from this point forward. We need to stop having people use idiosyncratic URLs that break every time we change delivery systems, and start getting people to use actual stable IRIs to access resources. The other thing is that every file that is a critical part of a standard needs to be included in the metadata for that standard, so that at least in theory a machine or human can determine all of the parts of a standard. We are not quite there yet, but following systematic patterns for IRIs of standards components is a piece of that. It also avoids us having to maintain a long list of custom redirects every time a critical file moves to some other place. I can easily and quickly set up the entry in the file that defines document redirects using the IRI …
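As a rough illustration of the kind of dereferencing check described above (the cURL options and Accept header shown here are assumptions; the exact commands from the original comment were not preserved):

```bash
# Request the schema document, following any redirect to the current delivery location
curl -L -H "Accept: application/xml" http://rs.tdwg.org/tapir/doc/xmlschema/

# Inspect only the response headers, to see whether the IRI answers with a 200 or a 3xx
curl -I http://rs.tdwg.org/tapir/doc/xmlschema/
```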
I wholeheartedly support the proposed solution. Others? @mdoering @peterdesmet @MattBlissett?
No objections
I support @baskaufs' proposal. But it does not seem to address the original issue, which is that https://dwc.tdwg.org/text/tdwg_dwc_text.xsd still imports xml.xsd from a GBIF URL:
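For context, the import being discussed looks roughly like this (a sketch only; the exact namespace prefix and formatting in tdwg_dwc_text.xsd may differ):

```xml
<!-- xml.xsd declares the attributes of the standard XML namespace (xml:lang, xml:space, ...) -->
<xs:import namespace="http://www.w3.org/XML/1998/namespace"
           schemaLocation="http://rs.gbif.org/schema/xml.xsd"/>
```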
If a "permanent" IRI is implemented, can we just change the …?
Technically I would strongly avoid using redirects. The IRI should return an HTTP 200, not a 3xx, as many default implementations, including Java's, fail to deal with that. See https://stackoverflow.com/questions/29696638/how-to-validate-xml-with-schema-urls-that-return-http-301 or https://planet.jboss.org/post/java_7_xml_entity_resolver_doesn_t_follow_redirects_makes_xsd_validation_fail
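A minimal sketch of the failure mode, assuming validation goes through the standard javax.xml.validation API (the class name is made up for illustration, and the behavior depends on the JDK version):

```java
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;

public class SchemaRedirectCheck {
    public static void main(String[] args) throws Exception {
        // Compile the Darwin Core text schema directly from its public URL.
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        // If this URL, or any schemaLocation it imports (such as xml.xsd),
        // answers with a 301/302 instead of a 200, the default JDK resolver
        // may not follow the redirect and schema compilation fails.
        Schema schema = factory.newSchema(
                new StreamSource("https://dwc.tdwg.org/text/tdwg_dwc_text.xsd"));
        System.out.println("Schema compiled: " + schema);
    }
}
```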
That's good to know, @mdoering. It sounds like the best solution for the XML schemas is to set up the content negotiation part of the script to serve requests for XML files without redirects. Do you know if most clients actually send a request header of …?

I think this could be implemented relatively easily by keeping a list on GitHub of the locations of the few XML schemas, having the server script load that list, pull the file from wherever it is, and serve it to the client with a 200. I do something like that here to determine whether particular terms should redirect to a web page or whether the script should generate a page from data.

That would require loading two files from GitHub the first time each XML file was requested, but I think that @MattBlissett has the server set up to cache requests for some period of time. So after the first request, clients should just get the file from the cache, which should be pretty efficient. That approach would allow changing the location of a file by just changing an entry in a table on GitHub, without making any changes to the server itself.

I don't think I have the bandwidth to deal with this now, so maybe it can be put off until I have time to work on the script and test carefully.
Looks like this is what javax.xml.validation requests:

…

Rather unspecific...
Wow. Not that great. I should have said …
@tucotuco Maybe we can back-burner this for the time being.
I agree. I think the highest priority is to move forward requests for new terms and term changes, then web site fixes, then the rest.
The original issue:
The rs.gbif.org server is about a metre from the rs.tdwg.org server, and significantly further from GitHub's servers. Communication between them is very reliable, and external availability is likely to be the same. Unless there's some other reason, I also agree this is very low priority.
Keeping at low priority and removing from current milestone in the interest of releasing the new terms. |
The DwC text XML schema references the W3C xml.xsd from a GBIF server. It would be better to reference a copy from the DwC repository instead, so that the schema does not rely on the external GBIF URL http://rs.gbif.org/schema/xml.xsd.