10.7 Adding new nodes

The Add button on the main page allows you to add new harvesting nodes. After pressing it, you will reach the page shown in Figure 10.3, “Adding a new harvesting node”. When creating a new node, you have to choose the harvesting protocol implemented by the remote server. The supported protocols are:

Geonetwork: This is the standard and most powerful harvesting protocol used in GeoNetwork. It is able to log in to the remote node, perform a standard search using the common queryable fields and import all matching metadata. Furthermore, the protocol will try to keep both the remote privileges and the categories of the harvested metadata if they exist locally. Please note that the harvesting protocol was improved in GeoNetwork 2.1, so it cannot be used to harvest from version 2.0 or below.

Web DAV: This harvesting type uses the web DAV (Distributed Authoring and Versioning) protocol to harvest metadata from a DAV server. It can be useful to users who want to publish their metadata through a web server that offers a DAV interface. The protocol makes it possible to retrieve the contents of a web page (a list of files) together with their change dates.

CSW: CSW stands for Catalogue Services for the Web, a search interface for catalogues developed by the Open Geospatial Consortium. GeoNetwork is compatible with version 2.0.1 of this protocol.

Old geonetwork: GeoNetwork 2.1 introduced a new, powerful harvesting engine which is not compatible with the old one present in GeoNetwork 2.0. Old 2.0 servers can still harvest from 2.1 servers, but a 2.1 server needs this harvesting type to harvest from old 2.0 servers. This harvesting type is deprecated and will be kept only until GeoNetwork 2.1 is in widespread use.

OAI-PMH: The acronym stands for Open Archives Initiative Protocol for Metadata Harvesting. It is a widely used harvesting protocol. GeoNetwork is compatible with version 2.0 of the protocol.

The drop down list shows all available protocols. Pressing the Add button takes you to an edit page whose content depends on the chosen protocol. The Back button returns to the main page.

Figure 10.3. Adding a new harvesting node

Adding a GeoNetwork node

This type of harvesting allows you to connect to a GeoNetwork node, perform a simple search as on the main page and retrieve all matching metadata. The search is useful because it allows you to focus only on metadata of interest. Once you add a node of this type, you will get a page like the one shown in Figure 10.4, “Adding a GeoNetwork node”. The meaning of the options is the following:

Figure 10.4. Adding a GeoNetwork node

Site: Here you put information about the GeoNetwork node you want to harvest from (host, port and servlet). If you want to search protected metadata you have to specify an account. The name parameter is just a short description that will be shown on the main page beside each node.

Search: In this section you can specify search parameters; they are the same as those available on the main page. Before doing that, remember that GeoNetwork harvesting can be hierarchical, so a remote node can contain both its own metadata and metadata harvested from other nodes and sources. Initially the Source drop down is empty and you have to use the Retrieve sources button to fill it. This button queries the remote GeoNetwork node for all the sources it is currently harvesting from. Once the drop down is filled, you can choose a source name to constrain the search to that source only. If you leave the drop down blank, the search will cover all metadata (harvested or not). You can add several search criteria for each site through the Add button: several searches will be performed and their results merged. Each search box can be removed by pressing the small button on the left of the site’s name. If no search criteria are added, a global unconstrained search will be performed.

Options: This is just a container for general options.

Every: This is the harvesting period. The smallest value is 1 minute and the greatest value is 100 days.

One run only: If this option is checked, the harvesting will do only one run, after which it will become inactive.

Privileges: Here you decide how to map the privileges of remote groups. You can assign a copy policy to each group. The Intranet group is not considered because it does not make sense to copy its privileges. The All group has different policies from all the others:

  1. Copy: Privileges are copied.

  2. Copy to intranet: Privileges are copied but to the Intranet group. This way public metadata can be made protected.

  3. Don’t copy: Privileges are not copied and harvested metadata will not be publicly visible.

For all other groups the policies are these:

  1. Copy: Privileges are copied only if there is a local group with the same (not localized) name as the remote group.

  2. Create and copy: Privileges are copied. If there is no local group with the same name as the remote group then it is created.

  3. Don’t copy: Privileges are not copied.
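The two policy lists above can be summarised as a small decision function. The following is only a minimal sketch of the rules as described in this section, not GeoNetwork’s actual implementation: the policy identifiers ("copy", "copyToIntranet", "createAndCopy", "dontCopy"), the group names and the function name are all hypothetical.

```python
# Hypothetical names: these identifiers are illustrative only and are not
# taken from GeoNetwork's code or configuration.
ALL_GROUP = "all"
INTRANET_GROUP = "intranet"


def target_group(remote_group, policy, local_groups):
    """Return the local group that receives the remote privileges, or None.

    local_groups is a set of local group names; it is extended in place
    when the "createAndCopy" policy has to create a missing group.
    """
    if remote_group == INTRANET_GROUP:
        return None  # the Intranet group is never considered
    if policy == "dontCopy":
        return None  # privileges are not copied
    if remote_group == ALL_GROUP:
        if policy == "copy":
            return ALL_GROUP
        if policy == "copyToIntranet":
            return INTRANET_GROUP  # public metadata becomes protected
        raise ValueError("policy not valid for the All group: " + policy)
    if policy == "copy":
        # copied only if a local group with the same name exists
        return remote_group if remote_group in local_groups else None
    if policy == "createAndCopy":
        local_groups.add(remote_group)  # created if it does not exist yet
        return remote_group
    raise ValueError("unknown policy: " + policy)
```

For example, with an empty set of local groups, a remote group named "sample" is skipped under "copy" but created and used under "createAndCopy".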

At the bottom of the page there are the following buttons:

Back: Simply returns to the main harvesting page.

Save: Saves the current node information and returns to the main harvesting page. When creating a new node, the node is actually created only when you press this button.
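The limits on the harvesting period (the Every option) can be expressed as a simple range check. This is a sketch under assumptions: the function name is hypothetical, and splitting the period into days, hours and minutes is an assumption about how the value is entered, not something stated in this section.

```python
from datetime import timedelta

# Limits stated in this section: from 1 minute up to 100 days.
MIN_PERIOD = timedelta(minutes=1)
MAX_PERIOD = timedelta(days=100)


def harvesting_period(days=0, hours=0, minutes=0):
    """Build a harvesting period and reject values outside the limits."""
    period = timedelta(days=days, hours=hours, minutes=minutes)
    if not MIN_PERIOD <= period <= MAX_PERIOD:
        raise ValueError("period must be between 1 minute and 100 days")
    return period
```

Both limits are inclusive here, so exactly 1 minute and exactly 100 days are accepted.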

Adding a Web DAV node

In this type of harvesting, metadata are retrieved from a remote web page. The available options are shown in Figure 10.5, “Adding a web DAV node” and have the following meaning:

Figure 10.5. Adding a web DAV node

Site: This is the connection information. The available options are:

Name: This is a short description of the node. It will be shown on the main harvesting page.

URL: The remote URL from which metadata will be harvested. Each file whose name ends with .xml is assumed to be a metadata record and will be retrieved, converted into XML and imported.

Icon: Just an icon to assign to harvested metadata. The icon will be used when showing search results.

Use account: Account credentials for basic HTTP authentication against the remote URL.

Options: General harvesting options:

Every: This is the harvesting period. The smallest value is 1 minute and the greatest value is 100 days.

One run only: If this option is checked, the harvesting will do only one run, after which it will become inactive.

Validate: If checked, the metadata will be validated during import. If the validation does not pass, the metadata will be skipped.

Recurse: When the harvesting engine finds folders, it will recursively descend into them.

Privileges: Here it is possible to assign privileges to imported metadata. The Groups area lists all groups available in GeoNetwork. Once one or more groups have been selected, they can be added through the Add button (each group can be added only once). For each added group, a row of privileges is created at the bottom of the list to allow privilege selection. To remove a row, simply press the associated Remove button on its right.

Categories: Here you can assign local categories to harvested metadata.

At the bottom of the page there are the following buttons:

Back: Goes back to the main harvesting page. The harvesting node is not added.

Save: Saves the node’s data, creating a new harvesting node, and then goes back to the main harvesting page.
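The way the URL and Recurse options select files can be illustrated without a real DAV server. The sketch below models a remote folder as a nested dictionary (folder names map to dictionaries, file names map to their change dates); this data model and the function name are assumptions made for illustration and do not reflect how GeoNetwork talks to the server.

```python
def select_metadata_files(listing, recurse, prefix=""):
    """Return the paths of all files ending with .xml in a folder listing.

    listing maps names to either a change date (a file) or another
    dictionary (a sub-folder). Sub-folders are visited only if recurse
    is true.
    """
    selected = []
    for name in sorted(listing):
        entry = listing[name]
        path = prefix + name
        if isinstance(entry, dict):
            if recurse:
                selected.extend(
                    select_metadata_files(entry, recurse, path + "/"))
        elif name.endswith(".xml"):
            selected.append(path)
    return selected


# Made-up listing used only to show the effect of the Recurse option.
remote = {
    "roads.xml": "2008-01-10",
    "notes.txt": "2008-01-11",
    "archive": {"rivers.xml": "2007-12-01"},
}
print(select_metadata_files(remote, recurse=False))
print(select_metadata_files(remote, recurse=True))
```

With Recurse unchecked only roads.xml is selected; with Recurse checked archive/rivers.xml is selected as well, while notes.txt is ignored in both cases.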

Adding a CSW node

This type of harvesting is capable of connecting to a remote CSW server and retrieving all matching metadata. Please note that, in order to be harvested, metadata must conform to one of the schemas handled by GeoNetwork. Figure 10.6, “Adding a Catalogue Services for the Web harvesting node” shows the available options, whose meaning is the following:

Figure 10.6. Adding a Catalogue Services for the Web harvesting node

Site: Here you have to specify the connection parameters, which are similar to those of web DAV harvesting. In this case the URL points to the capabilities document of the CSW server. This document is used to discover the location of the services to call in order to query and retrieve metadata.

Search: Using the Add button, you can add several search criteria. You can query only the fields recognized by the CSW protocol.

Options: General harvesting options:

Every: This is the harvesting period. The smallest value is 1 minute and the greatest value is 100 days.

One run only: If this option is checked, the harvesting will do only one run, after which it will become inactive.

Privileges: Please see web DAV harvesting.

Categories: Please see web DAV harvesting.

At the bottom of the page there are the following buttons:

Back: Goes back to the main harvesting page. The harvesting node is not added.

Save: Saves the node’s data, creating a new harvesting node, and then goes back to the main harvesting page.
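Since the URL field must point to the capabilities document, it may help to see what such a URL typically looks like. The sketch below appends the standard CSW key-value parameters service=CSW and request=GetCapabilities to an endpoint; the endpoint address is a made-up example and the function name is hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def capabilities_url(endpoint):
    """Return the endpoint URL with the GetCapabilities parameters added.

    Query parameters already present on the endpoint are preserved.
    """
    parts = urlsplit(endpoint)
    query = dict(parse_qsl(parts.query))
    query.update({"service": "CSW", "request": "GetCapabilities"})
    return urlunsplit(parts._replace(query=urlencode(query)))


# "example.org" is a placeholder host, not a real CSW server.
print(capabilities_url("http://example.org/csw"))
```

Some servers may require further parameters; check the documentation of the remote catalogue if the resulting URL does not return a capabilities document.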

Adding an OAI-PMH node

An OAI-PMH server implements a harvesting protocol that GeoNetwork, acting as a client, can use to harvest metadata. If you request the oai_dc output format, GeoNetwork will convert it into its Dublin Core format. Other formats can be harvested only if GeoNetwork supports them and is able to autodetect the schema from the metadata. Figure 10.7, “Adding an OAI-PMH harvesting node” shows all available options, which are:

Figure 10.7. Adding an OAI-PMH harvesting node

Site: All options are the same as in web DAV harvesting. The only difference is that the URL parameter here points to an OAI-PMH server. This is the entry point that GeoNetwork will use to issue all PMH commands.

Search: This part allows you to restrict the harvesting to specific metadata subsets. You can specify several searches, each with different criteria: GeoNetwork will execute them sequentially and merge the results so that the same metadata is not harvested twice. In each search, you can specify the following parameters:

From: You can provide a start date here. All metadata whose last change date is equal to or later than this date will be harvested. You cannot edit this field directly; you have to use the icon to pop up a calendar and choose the date. This field is optional, so if you don’t provide it the start date constraint is dropped. Use the icon to clear the field.

Until: Works exactly like the From parameter but adds an end constraint to the last change date. The until date is included in the date range: the check is “less than or equal to”.

Set: An OAI-PMH server classifies its metadata into hierarchical sets. You can request only the metadata that belong to one set (and its subsets). This narrows the search result. Initially the drop down shows only a blank option that indicates no set. After specifying the connection URL, you can press the Retrieve Info button, whose purpose is to connect to the remote node, retrieve all supported sets and prefixes and fill the search drop downs. After you have pressed this button, you can select a remote set from the drop down.

Prefix: Here prefix means metadata format. The oai_dc prefix is mandatory for any OAI-PMH compliant server, so this entry is always present in the prefix drop down. To have this drop down filled with all prefixes supported by the remote server, you have to enter a valid URL and press the Retrieve Info button.

You can use the Add button to add one more search to the list. A search can be removed by clicking the icon on its left.

Options: Most options are common to web DAV harvesting. The validate option, when checked, will validate each harvested metadata record against GeoNetwork’s schemas. Only valid metadata will be harvested; invalid ones will be skipped.

Privileges: Please see web DAV harvesting.

Categories: Please see web DAV harvesting.

At the bottom of the page there are the following buttons:

Back: Goes back to the main harvesting page. The harvesting node is not added.

Save: Saves the node’s data, creating a new harvesting node, and then goes back to the main harvesting page.

Please note that when you edit a previously created node, the set and prefix drop down lists will not be fully populated: they will contain only the previously selected entries, plus the default ones if they were not selected. Furthermore, the set name will not be localized; the internal code string will be displayed instead. You have to press the Retrieve Info button again to connect to the remote server and retrieve the localized names and all set and prefix information.
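The search parameters of an OAI-PMH node correspond to the arguments of the protocol’s ListRecords request (metadataPrefix, from, until and set). The following sketch shows how one search could be turned into such a request URL. It illustrates the protocol and is not GeoNetwork’s code; the function name, the server address and the set name are made up.

```python
from datetime import date
from urllib.parse import urlencode


def list_records_url(base_url, prefix="oai_dc", from_date=None,
                     until_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords URL for one search.

    from_date and until_date are optional and both inclusive; set_spec
    is optional and restricts the request to one set and its subsets.
    """
    if from_date and until_date and from_date > until_date:
        raise ValueError("the from date must not be later than until")
    params = {"verb": "ListRecords", "metadataPrefix": prefix}
    if from_date is not None:
        params["from"] = from_date.isoformat()
    if until_date is not None:
        params["until"] = until_date.isoformat()
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Placeholder server and set name, used only for illustration.
print(list_records_url("http://example.org/oai",
                       from_date=date(2008, 1, 1),
                       until_date=date(2008, 12, 31),
                       set_spec="maps"))
```

The sets and prefixes that fill the drop downs are exposed by the protocol through the ListSets and ListMetadataFormats requests, which is presumably what the Retrieve Info button relies on.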

