DataFlow - easy storage and sharing for research data

Developers Meeting 23 January 2012

 

 

Time - 11:30

Location - Via Skype

Attendees
Richard

Bhavana

Joe

Alex

Katherine

Steph

 

 

Agenda


note: put all tasks/considerations into JIRA so they don't get lost.

1) Release Date

When?

"Next week."

We will have a call this Friday (27 January) to check.

 

Users will be able to download a virtual machine from...??

Bhavana hasn't tried building a VM yet, waiting for updates to code.

Tidy-up from the code audit is not done yet, and some features in DataStage still need sorting out.  Steph to chase Ben O'Steen about this.  Bhavana to update Devdocs with her latest work.

Alex to meet Bhavana and Anusha for a quick review of what needs doing in Admiralbase.  Bhavana to start by making a list in JIRA for Alex to respond to.

 

What will be in it?

It was worth waiting for because:

You can now submit datasets to multiple databanks

Authentication is more lightweight - we no longer rely on LDAP configuration in Apache.  We've rebuilt the codebase to be more extensible, economical and stable.

New User Interface

 

Notes to test users:

1) This is a test system.  May be a bit ropey, please bear with us.

2) Be sure to back up your data, but don't worry, you won't lose it.  (It might become inaccessible, and getting it back would take more work -- it is quicker to restore from a backup.)

3) When it's time to update, ...

We'll need to use update scripts, rather than assuming a clean install with each update.  We had something like this for Admiral, so it shouldn't be impossible.  But nothing has been done about this yet.  If working with the VM, then this should be seamless (ish), but the next version will also be installable via Debian packaging, which would necessitate a fresh install anyway.  Subsequent releases would be sensibly updatable.

4) Easiest mistake to make: If you're using real data, you need to work out the permission system.  If you stick it online, and work out how to restrict it later, you could accidentally expose data that you didn't mean to.


2) Continuous testing

Date for code freeze?

Review it on Friday, and either freeze it Friday or mid-week at latest.

 

What does "continuous testing" mean for this project?  The last successful build was 4 months ago.  We have less than 4 months left in the project.

Partly, this is because we've been tidying up the codebase.  We haven't written any new tests for the system.  It's running on the current VM... and the VM isn't being updated.  We can (and will) run more frequent tests when the Debian packaging is done.  Weekly, or even nightly, builds to become the norm.

 

3) New JIRA issues from OSS Watch

A. Add licence information to all source files (DATAFLOW-210)
This is crucial for the release; for proper IP due diligence, every source file must have a header added to indicate its licence.

Steph to check with Sander about this - we can do it but only do the minimum in this next week.
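
One way to see how much work "the minimum" is: a rough sketch of a script that lists source files with no licence header yet.  The marker text "Licensed under", the 20-line window and the ".py" extension are assumptions for illustration only; the real header wording is whatever is agreed with Sander.

```python
from pathlib import Path

# Hypothetical marker: replace with a phrase from the agreed licence header.
LICENCE_MARKER = "Licensed under"


def has_licence_header(text, marker=LICENCE_MARKER, max_lines=20):
    """Return True if the marker appears in the first max_lines lines."""
    head = text.splitlines()[:max_lines]
    return any(marker in line for line in head)


def files_missing_header(root, extensions=(".py",), marker=LICENCE_MARKER):
    """List source files under root whose opening lines lack the marker."""
    missing = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in extensions:
            text = path.read_text(encoding="utf-8", errors="replace")
            if not has_licence_header(text, marker):
                missing.append(path)
    return missing
```

Calling files_missing_header(".") from the top of a checkout would return the paths still needing a header; the list could go straight into DATAFLOW-210.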


B. Perform licence and dependency check on source code (DATAFLOW-213)
All dependencies must be checked for licence compatibility before the release. The results must be reported, and when we do this it would be good to document everything (see DATAFLOW-209).

Everyone needs to confess any code-borrowing.  Steph to ask Sander what form this should take.  Use Doodle or SurveyMonkey to keep track of answers?


4) Roadmap

All done?  In JIRA?  No text or links on the website.

 

5) William Hudson

He's been on standby since November.  When does he start work, and who will help him get started?

Katherine to put him on the case -- start with VM next week.

 

6) Reply to blog post voting DataFlow the best "on paper"

Questions in email to Steph

 

7) Documentation

Katherine, Jo and Steph need to talk to most of you.  Bhavana and Anusha -- when convenient to meet Katherine?

Katherine to meet Bhavana on Tuesday morning.

 

8) SWORD

Richard has been working to update the two libraries related to SWORD, along with appropriate tests.  He has ported the basic SWORD server to Pylons.  All of the behaviours we expected from SWORD are now working for DataBank.

Need someone to test this now.  Hudson?

 

The authentication process works, but is very weak (allows any old username and password) -- not sure how the handover/authentication layout will eventually work.  OAuth?

 

Met David Shotton in early January.  Concern about the scalability of DataStage: no community yet outside of Oxford.  Could a SWORD server run over a directory, with DataStage as a wrapper around that?

Richard to talk to Bhavana about this...