
About the Data Portal Application

The Data Portal Application is developed by the Nordic Genetic Resource Center (NORDGEN) and Bioversity International. The data portal is developed as a generic web application written in PHP, using the ADODB database library to connect to a PostgreSQL database system. Some of the routine operations have also been coded in the Perl scripting language, but these Perl scripts are not critical to the operation of the web application itself. The web pages are coded as XHTML with the layout defined by CSS (Cascading Style Sheets). The portal web application has been successfully tested with the Apache web server on the Apple OS X, Linux and Windows operating systems.

The data portal web application was originally based on the SESTO genebank system developed by the Nordic Genetic Resource Center (NORDGEN). Each implementation of the Data Portal is only a different layout skin on the very same core portal code libraries. Portal implementations include the Svalbard Global Seed Vault (SGSV), the CWR Global Portal, the Generation Challenge Program and the Nordic Genetic Resource Center (NORDGEN), among others.

The data portal source code is available from the Subversion code repository hosted by the Nordic Genetic Resource Center (NORDGEN).

Software used by or useful to the data portal

Administration of an existing data portal implementation

To set up a new data portal (or manage an existing data portal), you should know the following technologies and software.
  • Apache web server. The data portal has not been tested with other web server software, so knowledge of the Apache web server is needed.
  • PostgreSQL database. The portal is implemented for the PostgreSQL database server software. It should be possible to move to another database system supported by the ADODB database abstraction library, but this has not been tested yet.
  • ADODB database abstraction library. All database connections are made with the ADODB library. Knowledge of ADODB is useful if the connection to the database needs to be modified.
  • PHP5. The data portal web application is coded with the PHP5 scripting language. If any modifications are needed you will need to know basic PHP programming.
  • XHTML, JavaScript and CSS. The PHP scripts generate XHTML code with the layout defined by CSS (Cascading Style Sheets). If you wish to modify the layout you will need to know CSS. Knowledge of XHTML is also recommended. Some of the client-side features of the data portal are implemented with JavaScript; these are mostly features for logged-in users.
  • Perl. The Perl scripting language is used for most of the automatic data import as well as some other maintenance tasks. You may write your own maintenance scripts using another programming language, but if you wish to modify the existing scripts, you will need to know basic Perl.
  • Data import. The data portal is a tool for making summary metadata, or complete datasets, searchable and available through the portal web application. These datasets are provided in many different formats. Some datasets are provided as easy-to-import SQL data dumps. Other datasets are provided as text files, including CSV (comma-separated values), tab-delimited text, or XML data files. Datasets provided as XML-based web services are also easy to interact with and often add more advanced features for keeping the dataset updated. If you wish to update datasets presented by the data portal, you will need to know how to extract the data you need and import it into the database server. The data portal includes a number of Perl scripts you may use for these tasks (described in more detail in the portal technical manual).
  • BioCASE, TAPIR, PyWrapper. The PyWrapper web service protocol (database wrapper) is an interface to germplasm datasets that is gaining popularity. Versions 1 and 2 of the PyWrapper protocol were called BioCASE, after the EU-funded project in which the first protocol was developed (at the Berlin Botanical Garden). Version 3 of the PyWrapper protocol is often called TAPIR, and is the first implementation of the TAPIR data exchange protocol. TAPIR is the specification of the protocol implemented by PyWrapper3; TAPIR is also implemented by the TapirLink and TapirDotNet database wrappers. If you plan to present datasets from a distributed network of TAPIR database wrapper implementations (for example PyWrapper3), you will need to know these technologies. The GBIF data portal is also compatible with TAPIR, so datasets shared with this protocol can easily be shared with GBIF as well.
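
The data import step described above, reading a CSV or tab-delimited text file and preparing it for the database, can be sketched roughly as follows. This is a minimal illustration in Python and is not one of the portal's own Perl scripts. The table name (accession), the column names and the sample rows are hypothetical, and the %s placeholder style assumes a PostgreSQL driver such as psycopg2; adjust it to whatever driver you use.

```python
import csv
import io


def read_delimited(text, delimiter=","):
    """Parse delimited text with a header row into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [dict(row) for row in reader]


def build_insert(table, columns):
    """Build a parameterised INSERT statement (placeholders only, no values).

    The table and column names are interpolated directly, so they must come
    from a trusted source such as your own configuration, never from the data.
    """
    column_list = ", ".join(columns)
    placeholders = ", ".join(["%s"] * len(columns))
    return f"INSERT INTO {table} ({column_list}) VALUES ({placeholders})"


# Hypothetical sample data; the column names are illustrative only.
CSV_SAMPLE = "accession_number,taxon_name\nNGB1,Hordeum vulgare\nNGB2,Avena sativa\n"
TSV_SAMPLE = CSV_SAMPLE.replace(",", "\t")

# The same parser handles both formats; only the delimiter changes.
csv_rows = read_delimited(CSV_SAMPLE)
tsv_rows = read_delimited(TSV_SAMPLE, delimiter="\t")

# "accession" is a hypothetical table name, not part of the portal schema.
statement = build_insert("accession", list(csv_rows[0].keys()))
```

With a database connection in place, each parsed row would be passed as the parameter values for the generated statement, which leaves quoting of the data values to the driver.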

To add a new scope to the data portal

  • Choose a scope acronym
  • Create a sub folder named after the scope acronym, page_elements/[acronym], for the menus and layout structure of your new portal
  • Create a sub folder named after the scope acronym, html/css/[acronym], holding the style sheet html/css/[acronym]/style.css for the CSS style layout of your new portal
  • Create a sub folder named after the scope acronym, webpages/[acronym], for the information pages of your new portal
  • All of these scope elements have a default from the generic portal software; adding content to the scope sub folders as described above will override the default
  • Set up the database for your new [scope]:
    • Create a database using the new scope acronym
    • SQL: CREATE DATABASE [scope] WITH ENCODING='utf8'; (or use another encoding if you prefer)
    • ...or configure your new scope to use an existing database using the page_elements/[scope]/settings.phps file
    • You will also need the table data_model to get started:
    • SQL:
    • CREATE SEQUENCE data_model_data_model_id_seq;
    • CREATE TABLE data_model (
        data_model_id integer PRIMARY KEY NOT NULL DEFAULT nextval('data_model_data_model_id_seq'),
        data_concept_id integer,
        column_name varchar(255),
        table_name varchar(255),
        database_name varchar(255),
        data_type varchar(255),
        char_max_length integer,
        is_nullable varchar(5),
        column_default varchar(255),
        ordinal_position integer,
        primary_key boolean,
        foreign_key_to_table varchar(255),
        data_title varchar(255),
        data_description varchar(255),
        data_unit varchar(255),
        link_out_url varchar(255),
        remarks text,
        creusr character varying(32) DEFAULT CURRENT_USER,
        credtm timestamp(0) without time zone DEFAULT now(),
        updusr character varying(32) DEFAULT CURRENT_USER,
        upddtm timestamp(0) without time zone DEFAULT now()
      );
    • GRANT SELECT, UPDATE ON data_model_data_model_id_seq TO public;
    • GRANT SELECT, UPDATE, INSERT ON data_model TO public;
  • To log in to your new scope you will need a database table named person, populated with the usernames and passwords you want to use
  • Logging in gives access to some additional functions and to data with limited access, but as this is a data portal and not an information system, you are referred to the SESTO system for a more complete genebank information system
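
The folder steps above, together with the rule that scope content overrides the generic default, can be sketched roughly as follows. This is a minimal illustration in Python and not the portal's own PHP code. The three folder names come from the steps above; the scope acronym "demo", the function names and the "default" folder used as the fallback location are hypothetical, since where the generic portal software actually keeps its defaults is not stated here.

```python
import tempfile
from pathlib import Path

# The three per-scope parent folders named in the steps above.
SCOPE_FOLDERS = ("page_elements", "html/css", "webpages")


def create_scope_folders(portal_root, acronym):
    """Create the per-scope sub folders and return their paths."""
    created = []
    for parent in SCOPE_FOLDERS:
        folder = Path(portal_root) / parent / acronym
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


def resolve_element(portal_root, parent, acronym, filename, default_scope="default"):
    """Return the scope's own file if it exists, else the generic default.

    ASSUMPTION: a folder named "default" holding the generic files is
    hypothetical; the real portal may store its defaults elsewhere.
    """
    scoped = Path(portal_root) / parent / acronym / filename
    if scoped.exists():
        return scoped
    return Path(portal_root) / parent / default_scope / filename


# Try it in a throwaway directory with the hypothetical scope acronym "demo".
with tempfile.TemporaryDirectory() as root:
    folders = create_scope_folders(root, "demo")
    folder_names = [f.relative_to(root).as_posix() for f in folders]

    # No scope style sheet yet, so the lookup falls back to the default.
    before = resolve_element(root, "html/css", "demo", "style.css")
    before = before.relative_to(root).as_posix()

    # Once the scope has its own style.css, it overrides the default.
    (Path(root) / "html/css" / "demo" / "style.css").write_text("body { margin: 0; }\n")
    after = resolve_element(root, "html/css", "demo", "style.css")
    after = after.relative_to(root).as_posix()
```

The point of the lookup order is that a new scope can start out empty and still render with the generic layout; files are added to the scope folders only where the new portal should differ.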

The Data Portal is still under development...