Monday 21 December 2009

Sizing systems based on very little input data

Are there any tried and tested methods for sizing a replacement JavaEE-based system based on typical non-functional requirements (e.g. active user session capacity, response times), coupled with a view of business volumes in transactions-per-minute from, say, a DEC Alpha system?

I was working through this for a couple of days at the end of last year, and the point I struggled with was the normalisation of a legacy transaction profile (derived from terminal session transaction data) onto an entirely different Java-based architecture, where the end-state transaction profile and benchmarks will be very different. Does everyone use approximate gearing factors (e.g. 10x a business transaction) to get to a tpm-C number, or are there better approaches to sizing replacement systems, such as using existing system benchmarks like SAP SD? Clearly, the only real way is to load-test and profile, but what happens if you've only got half a day, you're doing this pre-contract before a line of code has been cut, and you barely know what the target architecture will be?
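To make the gearing-factor route concrete, here's a minimal sketch of the kind of back-of-envelope arithmetic I mean. Every number in it - the measured business rate, the 10x gearing factor, the peak headroom and the per-CPU tpm-C rating - is an illustrative assumption, not a calibrated value.

```java
/**
 * Back-of-envelope sizing sketch: convert a legacy business-transaction
 * rate into an indicative tpm-C figure via a gearing factor, then into
 * a CPU count using a published benchmark rating.
 * All inputs are illustrative assumptions, not calibrated values.
 */
public class GearingFactorSizing {

    public static void main(String[] args) {
        double legacyBusinessTpm = 1200.0;  // from terminal session data (assumed)
        double gearingFactor     = 10.0;    // tpm-C equivalents per business transaction (assumed)
        double peakHeadroom      = 2.0;     // peak-to-average allowance (assumed)
        double tpmcPerCpu        = 25000.0; // from a vendor's published tpm-C rating (assumed)

        double requiredTpmc = legacyBusinessTpm * gearingFactor * peakHeadroom;
        int cpus = (int) Math.ceil(requiredTpmc / tpmcPerCpu);

        System.out.printf("Indicative capacity: %.0f tpm-C -> %d CPU(s)%n",
                requiredTpmc, cpus);
    }
}
```

The value of writing it down like this is that each factor becomes an explicit, challengeable assumption rather than a number buried in a spreadsheet.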

Happy to work through the approach I took but interested in your thoughts.

Monday 29 June 2009

JavaOne 2009 - Slides Available

My JavaOne talk in the SOA Platform and Middleware Services track has been published online. You can get the PDF here but the other link should give you the MPEG when it is released.

Friday 5 June 2009

Thank you JavaOners!

Quick message to those who saw my talk today at JavaOne: thanks for coming, and I hope you got something out of it. I will be using this blog to post up more details, but feel free to give me feedback or ask questions.

Medical Systems using Glassfish / Open ESB

I attended a BOF at JavaOne last night on:

Medical Instrument Systems Middleware – An ESB based SOA Solution
BOF-4738
Haridas Puthiyapurayil, Senior Software Engineer, Abbott Diagnostics (R&D Hematology)

Haridas showed his solution for deploying an ESB to integrate laboratory systems (through RS-232 interfaces) with other services that provide patient details, and with other clinician systems. The context of this presentation is similar to something I've been working on in the UK for the past month, so it was good to learn from someone in the field.

Here are a few notes I jotted down; hope these are useful to you. I found it interesting that this is the second talk about using the Glassfish ESB and App Server in a healthcare environment - must be cost reasons and the availability of HL7 encoding, methinks.

Background
How can you use SOA to build systems without touching (changing the interfaces of) existing medical systems? The objective was to build a highly modular middleware system for integrating and exchanging instrument analyses with the labs.

A clinical lab is about:
  • many instrument vendors
  • many software providers
  • many clinical service providers

They needed a framework for patient record/result exchange (predict, diagnose, monitor), supporting interconnections among the components involved: instruments, diagnostics, labs, patients and doctors. All of this is controlled by the FDA and other regulatory bodies.

Where is the complexity?
  • multiple instrument systems from several vendors with their own middleware solutions
  • multiple standards
  • systems run legacy applications
  • integration issues
  • healthcare systems are complex
The core component is an Instrument Binding Component, which they wrote as a JBI module using JAXB 2.0 (see JSR 222) - incidentally, as you know, JAXB is quite complex, so look here for a simpler view.

The key standard was LIS, which is used to integrate with a number of LIS systems. I was surprised that this wasn't HL7, but I suppose it just isn't used that frequently by instrument systems.
The architecture wasn't too complex (on the face of it) and was based on Glassfish (Open ESB and App Server), although they didn't mention how the small matter of the patient record database was maintained/controlled.

The standard is difficult to google, so try this: NCCLS-LIS2-A2 - it appears to be a loose specification for transferring information between clinical laboratory instruments and information systems. I say loose because there is no standard schema definition for it (!), which apparently is quite typical.

To build the Encoder, they:
1) Mapped the vendor's schema definitions and the standard's together, i.e. field elements contained vendor-specific elements.
2) Fed the pipe-delimited file and the LIS2 definition into the XJC compiler for JAXB; the binding component then puts the result on the bus (see the sketch below).
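To illustrate the JAXB step, here's a minimal sketch of what the unmarshalling side of such a binding component might look like. LIS2Message and the file name are hypothetical stand-ins of mine for whatever XJC actually generates from the mapped schema - this isn't Abbott's code.

```java
import java.io.File;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.annotation.XmlRootElement;

public class EncoderSketch {

    // Hypothetical stand-in for a class XJC would generate from the mapped schema.
    @XmlRootElement(name = "lis2Message")
    public static class LIS2Message {
        // fields mapped from the pipe-delimited record would go here
    }

    public static void main(String[] args) throws JAXBException {
        // Build a JAXB context over the XJC-generated classes.
        JAXBContext context = JAXBContext.newInstance(LIS2Message.class);

        // Unmarshal an instrument message (already converted to XML)
        // into the generated object graph.
        LIS2Message message = (LIS2Message) context.createUnmarshaller()
                .unmarshal(new File("result-record.xml"));

        // A real binding component would now wrap this in a normalized
        // message and put it on the JBI bus.
        System.out.println("Unmarshalled: " + message);
    }
}
```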

A useful presentation from Haridas of Abbott on using an ESB in the real world to solve a problem. A nice talk.

Wednesday 3 June 2009

Distributing builds using Hudson PXE

I attended a really good talk (TS-5301) today at JavaOne 2009 from Kohsuke and Jesse about running Hudson in the Cloud. See their wiki for upcoming meet-ups.

We've been using Hudson at our company recently (a quick thanks to Aleksey Maksimov for setting up our project quick-start!) and so this is my first look at Hudson - coming from a CruiseControl background - but suffice to say, I'm really impressed!

Hudson PXE enables environments to be configured on a Master and deployed to Slaves. You can essentially set up an Ubuntu Server as a slave with the JDK version you want and a container such as Glassfish (or whatever), and another slave running Red Hat with Apache HTTP Server, etc. The only slave requirement is a 170kb JAR; you can even run Windows slaves - all the master needs is a hostname. The key thing is that this is all automated for you: configure it once and then let the slaves be provisioned according to your policy.

They have really thought about network segregation. What I mean is, the problem with a Master-Slave setup when there are firewalls between them is that the initiator of the connection now matters. Either initiation model is possible with Hudson, as the Slave can use JNLP to initiate the connection back to the Master - and having 'clicked once', Hudson will let you set that up as a service for Windows slaves. A headless slave can kick off the same connection from the command line, as sketched below.
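For reference, the headless launch looks something like this - the hostname, port and slave name are placeholders of mine, and I'm quoting the flag from memory rather than from the talk:

```
# Run on the slave, after downloading slave.jar from the master
# (hostname and slave name below are placeholders):
java -jar slave.jar -jnlpUrl http://hudson-master:8080/computer/ubuntu-slave-1/slave-agent.jnlp
```

The same JAR drives both connection models; only who dials first changes.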

Having established a cluster, we get a new set of challenges around the state of those nodes and how to manage the distribution of jobs to slaves. The latter is solved in Hudson using Labels, meaning that we can use friendly terms like 'ubuntu' to refer to any number of Slave configurations/distributions. For the former, Hudson actively monitors a number of Slave parameters such as free disk space, memory, free tmp space and the network. One of the issues we frequently get with remote builds, even with just a couple of machines, is whether we've filled up /var with old source. Here, Hudson will keep a watch on these things and report in the Hudson console. Hudson will also help clean up any runaway processes after a build and show you load statistics...

By acting on those load statistics, we can let Hudson spin up additional Slaves in the cloud. The Hudson GUI supports AWS/EC2, and at 10c/hour we have a really cost-effective way of dealing with the 5.30pm spikes we frequently get for the last builds of the day. There was a great demo showing how this was done, so do check out the slides/MP4 for more details.

Whilst there are great advantages to this (it's cheap, it deals with spikes, and there's no need to tear down instances carefully), the presenters point to problems such as check-out time; tests may require behind-the-firewall connections to systems; and (most importantly, they stressed) the lack of a multi-threaded, cross-VM unit-testing framework.

You can get round some of these problems by running your Master in the Cloud too, but you'll then be paying for EBS as well (Amazon's storage: Elastic Block Store).

Final thoughts (ones that I'll be looking at again) - they talked about:
1) Hudson Hadoop (there are AMIs for Hadoop already on Amazon), so you can turn every Hudson Slave into a Hadoop node - thereby provisioning on the fly based on load requirements. This opens up future directions for Hudson, such as analysis of old code artifacts (are we getting better over time?).
2) The Hudson Selenium Plug-in. You can now have a grid of Selenium slaves, each running a different platform/browser config, all of which talk to the Selenium Hub and report their tests (see the sketch below). Sounds like a great opportunity for dev shops to save money here!
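To give a flavour of point 2, here's a minimal sketch of a Selenium 1 (RC) test pointed at a grid hub rather than a local server; the hub then farms the session out to whichever slave matches the requested browser. The hub hostname and application URL are placeholders of my own, not anything from the talk.

```java
import com.thoughtworks.selenium.DefaultSelenium;
import com.thoughtworks.selenium.Selenium;

// Selenium RC smoke test aimed at a grid hub instead of a local
// Selenium server. Hostname and application URL are placeholders.
public class GridSmokeTest {

    public static void main(String[] args) {
        Selenium selenium = new DefaultSelenium(
                "selenium-hub",           // grid hub rather than localhost (placeholder)
                4444,                     // default hub port
                "*firefox",               // environment the hub matches to a slave
                "http://app-under-test/"  // base URL of the application (placeholder)
        );
        selenium.start();                 // hub allocates a matching slave
        selenium.open("/");               // drive the browser on that slave
        System.out.println("Page title: " + selenium.getTitle());
        selenium.stop();                  // release the slave back to the grid
    }
}
```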

I have to thank the guys here - my article is a reasonable (if rather newbie) summary of the great work that's been put in. See the slides/MP4 for more details or go along to BOF-5105.