Friday, July 15, 2011

[KATALIST] Fwd: CERN is hiring developers for INSPIRE

** Apologies for cross posting **

===

CERN Fellowships: text mining scientific documents; author disambiguation in INSPIRE

The CERN Scientific Information Service is looking for two enthusiastic and motivated developers with experience in text-mining or digital libraries, to join a dynamic international collaboration which is building, enhancing and operating the INSPIRE information service, a digital library which is a key working tool used by 50'000 scientists worldwide in their cutting-edge research in High-Energy Physics. We have two fellowships: the first for the text mining of scientific documents, the second for author disambiguation and management.

What you will do (text mining fellowship):

- Develop and expand our current text-mining of documents to extract all possible metadata: authors, affiliations, references and additional scientific content (figures, tables and more). Build infrastructure to mine in real time, leveraging user feedback, as scientists share documents, or for bulk mining of large collections of scanned/OCR'ed historical material.
- Integrate, harmonize and expand all steps in the treatment of documents upon ingestion in INSPIRE from multiple sources, from extracting metadata to grabbing figures, from detecting similarities to spotting duplication.
- Explore opportunities in the extraction of the contextual information provided by the location of references, figures and tables in scientific texts.

What you will do (author disambiguation and management fellowship):

- Expand and develop our author disambiguation and profile-claiming production infrastructure, with the aim of automatically associating every newly accessed document with the correct author profile.
- Extend our author-article algorithmic and crowd-sourced tools to provide assertions about the academic affiliation of scientists.
- Ensure seamless interoperability and bulk-data exchange with other relevant partners such as NASA-ADS, arXiv.org, ORCID and leading publishers in Physics.

Other things you will do (for both fellowships):

- According to your inclination and abilities, help out on other projects, such as crowdsourcing aspects of digital library curation, integrating our services with other data sources via linked open data, UI/UX design, operations of production and mining of usage data.
- We require limited participation in stand-by duty for hot-fixes in the operation of the INSPIRE web service on evenings, weekends and public holidays.

Your profile:

- You are a citizen of one of the CERN Member States: Austria, Belgium, Bulgaria, the Czech Republic, Denmark, Finland, France, Germany, Greece, Hungary, Italy, Netherlands, Norway, Poland, Portugal, Slovakia, Spain, Sweden, Switzerland and the United Kingdom. Citizens of Romania can now also apply.
- You hold a BSc, MSc or PhD in Computer Science and have fewer than 10 years of professional experience after your highest diploma.
- You understand how scientists communicate and have either a proven track record in handling or mining technical or academic documents, or experience in author disambiguation in a large-scale digital library.
- You have solid experience in developing in a LAMP (Linux, Apache, MySQL, Python) stack, preferably in open source projects, using git or a similar DVCS, and desirably in a production environment.
- Familiarity with issues and standards in information systems is an asset: XML, XSLT, RSS, OAI-PMH.

Who we are:

CERN is the world's leading laboratory in High-Energy Physics, home to the record-smashing LHC accelerator. Together with partners at SLAC/Stanford, Fermilab and DESY/Hamburg, the CERN Scientific Information Service and IT teams are building INSPIRE: a digital library serving 1 million records to 50'000 scientists in the field worldwide, which is in beta at http://inspirebeta.net. We collaborate closely with sister infrastructures arXiv at Cornell and the NASA/ADS at Harvard, as well as leading publishers in the field. We are founding members of the ORCID initiative, and stalwarts of Open Access through a myriad of projects and initiatives.

What we offer:

- Contract duration: One year, which might be extended for a second year, conditional on performance. Further extension up to a maximum of three years can be granted under some circumstances.
- Financial conditions: Fellows' stipends are competitive and calculated individually according to age and qualifications, in the range 55'000-85'000 CHF per annum, net. Fellows are entitled to additional family and child allowances. International civil servants in the area are allowed to purchase discounted tax-free vehicles.
- Leave: Fellows are entitled to 2.5 days paid leave per month, plus two weeks at Christmas and a few other local holidays.
- Insurance: Fellows are covered by CERN's comprehensive health scheme for themselves and their dependents.
- Travel expenses: Fellows are entitled to travel expenses for themselves and their family and may be entitled to an installation grant. We also offer help with finding suitable accommodation.

How to apply:

Create an account and submit a complete electronic application form at http://bit.ly/oDhSRq , containing your Curriculum Vitae, a photocopy of your last (highest) qualification, a short (half page) description of your motivation for coming to CERN and working with INSPIRE, and the names of three referees who will provide us with letters of recommendation. It is your responsibility to arrange for these letters. Please indicate "INSPIRE" in the field "Miscellaneous information: Please give details of the work you are interested in doing at CERN". NOTE that we will not be able to process your application otherwise.

In parallel, it is indispensable that you also send us a copy of your CV at hiring@inspirebeta.net

Deadline:

Irrespective of deadlines indicated on the application web page, the application and ALL supporting documents should reach us BEFORE August 10th, 2011. Retained candidates will be interviewed remotely in the second half of August. The two successful candidates will start on October 1st.

Background:

Built on the CERN Open Source Invenio digital library software, and hosting 1 million records hand-curated over 40 years by partners at SLAC/Stanford, Fermilab and DESY/Hamburg, INSPIRE serves 1 million records to 50'000 High-Energy Physics researchers worldwide. INSPIRE, in beta at http://inspirebeta.net, provides fast metadata and full-text searches, author disambiguation, citation analysis, and is expanding its content and services in a community-centric approach, in addition to journal publications and other scientific contents. We anticipate users will soon be submitting scientific documents, and large scale recovery of historical OCR'ed material will take place, with hundreds of thousands of documents from 3 to 300 pages long, which will have to be mined for automatic generation of metadata. Further, we will explore and expand initiatives for figures and tables extraction from the text, as well as contextual information on references.

Further information about the CERN fellowship program is available at https://ert.cern.ch/browse_www/wd_pds?p_web_site_id=1&p_web_page_id=5834

Further information about the position can be obtained by writing to hiring@inspirebeta.net
