Create a Plate Set

  1. Click the ‘Projects’ hyperlink on the global navigation pane. Navigate into the project that will contain the plate set. Note that the ‘Plate Set For…’ link now indicates the default working project.

  2. Select the check box next to the plate set into which you would like to add a new plate set. Click the tools icon and select ‘add to project…’.

  3. Fill in the required information and click submit.
Item    Notes
Type    Choose a descriptive type that can later be used for sorting
Layout  Check the Layout viewer to learn about the different layout options

Available layouts are presented in the dropdown appropriate to the plate format/type:

Once complete, the new plate set will be visible in the client window.

Features

LIMS*Nucleus falls into the “systems” category of LIMS development. It contains a limited set of features and is designed to be integrated into a larger system.

Goal

A comprehensive but limited feature set (multi-well plate sample management) with well defined inputs and outputs, designed to be incorporated into a larger system.

Implemented features

Create project, plate set, plates, wells with and without samples
Group plate sets into a new plate set
Subset plates from a plate set into a new plate set
Reformat plates from 96 into 384 well plates and 384 into 1536 well plates
Apply assay data to plate sets
Visually or algorithmically evaluate assay data
Identify hits as samples surpassing a threshold; create hit lists
Create worklists for liquid handling robots
Rearray hits from a hit list into a new plate set
Apply user generated barcodes to plates
Apply user generated accession IDs to samples


Monoliths

LIMS (Laboratory Information Management Systems) can be broadly divided into two groups: monoliths and systems. The difference is less about functionality and more about architecture. A monolith is a large, all-inclusive application that maximizes automation and minimizes user intervention. Monoliths are very efficient when a process is standardized and unchanging.

Advantages

  • Full automation, maximum reduction in FTE requirements
  • Consistent, reproducible processing
  • Enhancements, upgrades, and training outsourced to the vendor
  • User groups provide resources for problem solving (bug fixes, add on components, help with problems)

Disadvantages

  • Cost
  • Many moving parts (database, ORM, web server, interface)
  • Complex - requires extensive training
  • Feature creep
  • Brittle - difficult to change in response to a changing process
  • Dependent on the vendor for bug fixes and upgrades
  • Off-the-shelf solutions may not satisfy all requirements
  • May depend on obscure components (old programming languages, object database, image)
  • Custom solutions may be obsolete on delivery
  • Resistance to use


Create a Project

Only administrators can create, edit, or delete projects. A project is the only entity that can be deleted; deletion cascades, removing all plate sets, plates, samples, assay runs, and data that are part of the project. Access the “Add Project” menu item under the Admin menu, visible only to administrators:

A dialog will appear with 2 fields, name and description. Pressing OK will add to and refresh the project list in the main browser.

Edit Project

Select the project to be edited, then from the menu bar select Admin/project/edit. The edit project dialog box will be pre-populated with the selected project name and description. Make changes and save.

Delete Project

Select the project to be deleted and from the menu bar select Admin/project/delete. The delete project dialog box will be presented for confirmation.

Reformat Overview

Reformat is an operation performed on plate sets. Reformat collapses four 96 well plates into a single 384 well plate, providing one replicate per sample. It is also possible to generate duplicates or quadruplicates of each sample. The plate layout of a reformatted destination plate is predetermined by the layout of the source plate, with some examples shown below. Source plates are loaded into destination plates in plate set order, following the Z pattern through the quadrants.

[Figures: 96 well source plate mapped to a 384 well destination plate, shown as singlicates, duplicates, and quadruplicates]

The same patterns apply when reformatting 384 well plates into 1536 well plates.
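The quadrant mapping described above can be sketched in a few lines. This is an illustrative Python sketch, not the LIMS*Nucleus implementation; the function name and 0-indexed coordinates are assumptions:

```python
def map_96_to_384(row, col, quadrant):
    """Map a 0-indexed 96 well coordinate (8 x 12) to its 384 well position.

    quadrant is 1-4 in Z order through the destination plate:
    1 = top-left, 2 = top-right, 3 = bottom-left, 4 = bottom-right.
    """
    # quadrant offsets: 1 -> (0,0), 2 -> (0,1), 3 -> (1,0), 4 -> (1,1)
    dr, dc = divmod(quadrant - 1, 2)
    return 2 * row + dr, 2 * col + dc
```

Under this sketch, A1 of the first source plate (quadrant 1) lands in A1 of the destination, A1 of the second source plate lands in A2, and so on; the same interleaving applies when moving 384 into 1536 well plates.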

Replication

LIMS*Nucleus handles five combinations of replication for samples and targets. In the table below, the pattern columns show the distribution of samples and targets in the four quadrants of a 384 or 1536 well plate, where each color represents a unique sample (or target).

Label  Sample replication  Target replication  N targets  N rep (Assay)
1S4T   unique              quadruplicates      1          1
2S2T   duplicates          duplicates          2          1
2S4T   duplicates          quadruplicates      1          2
4S1T   quadruplicates      unique              4          1
4S2T   quadruplicates      duplicates          2          2

(The sample and target pattern columns are graphical and are not reproduced here.)

As an example consider an assay of eight 96 well plates that will be reformatted into 384 well plates for assay:

1S4T: Eight 96 well plates will be reformatted into two 384 well plates (four 96 well plates per 384) with a unique plate in each quadrant. Each 384 well plate is coated with the same target in all four quadrants. There is only one target (N targets) and each sample is assayed once (N rep (Assay)).

2S2T: Eight 96 well plates will be reformatted into four 384 well plates (two 96 well plates per 384) with one sample plate replicated in quads 1,2 and a second replicated in quads 3,4. The 384 well plates are coated with 2 antigens, antigen A in all odd columns and antigen B in all even columns. There are two targets (antigens A and B - N targets) and each sample is assayed once on each target (N rep (Assay)). Because twice as many antigens are being assayed as in the example above, twice as many plates are needed.

4S2T: Eight 96 well plates will be reformatted into eight 384 well plates (one 96 well plate per 384) with one sample plate replicated in all four quadrants. The 384 well plates are coated with 2 antigens, antigen A in all odd columns and antigen B in all even columns. There are two targets (antigens A and B - N targets) and each sample is assayed twice on each target (N rep (Assay)).

Some combinations are avoided

For example, these are not an option:

Label  Sample replication  Target replication  N targets  N rep (Assay)
1S2T   unique              duplicates          2          1
1S1T   unique              unique              4          1

1S2T: Half the source plates are tested on one antigen, the other half on a different antigen. This is better broken into two assays, one assay per antigen, or 2 X [ 1S4T ].

1S1T: Convert to 4 X [ 1S4T ], one assay per antigen.

Simplifying Assumptions

Samples are always replicated horizontally (quads 1,2 and 3,4). Targets are always replicated vertically (quads 1,3 and 2,4).
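The two assumptions above fully determine which sample and target a quadrant holds. A minimal Python sketch for the 2S2T case (illustrative only; the function and placeholder names are assumptions, not LIMS*Nucleus code):

```python
def quad_contents_2s2t(quadrant, samples=("S1", "S2"), targets=("A", "B")):
    """Return the (sample, target) pair held by a quadrant (1-4) in a 2S2T layout."""
    # samples replicate horizontally: quads 1,2 share a plate, quads 3,4 share the other
    sample = samples[0] if quadrant in (1, 2) else samples[1]
    # targets replicate vertically: quads 1,3 share a target, quads 2,4 share the other
    target = targets[0] if quadrant in (1, 3) else targets[1]
    return sample, target
```

Walking the four quadrants in Z order yields (S1, A), (S1, B), (S2, A), (S2, B), so each sample meets each target exactly once, matching N targets = 2 and N rep (Assay) = 1.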

Sequence evaluation

When processing sequences obtained from a vendor, it is useful to have an idea of how well the sequencing reactions worked, both in an absolute sense and relative to other recently obtained sequences in the same project. What follows is a primary sequence independent method of evaluating a collection (i.e., an order from an outside vendor) of sequences.

The first step is to align sequences by nucleotide index (ignoring the actual sequence). Start by reading the sequences into a list. I use the list s.b to hold forward (5’ to 3’) sequences, and the list s.f to hold the reverse (but in the 5’ to 3’ orientation, as sequenced) sequences:

rm(list=ls(all=TRUE))
library(seqinr)

working.dir <- "B:/<my-working-dir>/"


back.files <- list.files( paste(working.dir, "back/", sep="" ))
for.files <- list.files( paste(working.dir, "for/", sep="" ))

> back.files[1:20]
[1] "MBC20120428a-A1-PXMF1.seq" "MBC20120428a-A10-PXMF1.seq"
[3] "MBC20120428a-A11-PXMF1.seq" "MBC20120428a-A12-PXMF1.seq"
[5] "MBC20120428a-A2-PXMF1.seq" "MBC20120428a-A3-PXMF1.seq"
[7] "MBC20120428a-A4-PXMF1.seq" "MBC20120428a-A5-PXMF1.seq"
[9] "MBC20120428a-A6-PXMF1.seq" "MBC20120428a-A7-PXMF1.seq"
[11] "MBC20120428a-A8-PXMF1.seq" "MBC20120428a-A9-PXMF1.seq"
[13] "MBC20120428a-B1-PXMF1.seq" "MBC20120428a-B10-PXMF1.seq"
[15] "MBC20120428a-B11-PXMF1.seq" "MBC20120428a-B12-PXMF1.seq"
[17] "MBC20120428a-B2-PXMF1.seq" "MBC20120428a-B3-PXMF1.seq"
[19] "MBC20120428a-B4-PXMF1.seq" "MBC20120428a-B5-PXMF1.seq"
>

Next determine the number of files read and create a list of that length to hold the sequences. Then read them in and inspect a sequence:

s.b <- list()
length(s.b) <- length(back.files)
s.f <- list()
length(s.f) <- length(for.files)

for(i in 1:length(back.files)){
  s.b[[i]] <- read.fasta(paste(working.dir, "back/", back.files[[i]], sep=""))
}

for(i in 1:length(for.files)){
  s.f[[i]] <- read.fasta(paste(working.dir, "for/", for.files[[i]], sep=""))
}

> getSequence(s.b[[2]])[[1]][1:200]
[1] "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "c" "n" "n" "n"
[19] "g" "t" "c" "c" "a" "c" "t" "g" "c" "g" "g" "c" "c" "g" "c" "c" "a" "t"
[37] "g" "g" "g" "a" "t" "g" "g" "a" "g" "c" "t" "g" "t" "a" "t" "c" "a" "t"
[55] "c" "c" "t" "c" "t" "t" "c" "t" "t" "g" "g" "t" "a" "g" "c" "a" "a" "c"
[73] "a" "g" "c" "t" "a" "c" "a" "g" "g" "c" "g" "c" "g" "c" "a" "c" "t" "c"
[91] "c" "g" "a" "t" "a" "t" "t" "g" "t" "g" "a" "t" "g" "a" "c" "t" "c" "a"
[109] "g" "t" "c" "t" "c" "c" "a" "c" "t" "c" "t" "c" "c" "c" "t" "g" "c" "c"
[127] "c" "g" "t" "c" "a" "c" "c" "c" "c" "t" "g" "g" "c" "g" "a" "g" "c" "c"
[145] "g" "g" "c" "c" "g" "c" "c" "a" "t" "c" "t" "c" "c" "t" "g" "c" "a" "g"
[163] "g" "t" "c" "t" "a" "g" "t" "c" "a" "g" "a" "g" "c" "c" "t" "c" "c" "t"
[181] "a" "c" "a" "t" "a" "a" "t" "g" "g" "a" "t" "a" "c" "a" "a" "c" "t" "a"
[199] "t" "a"


Note that ambiguities are indicated with an “n”. The sequence evaluation will involve counting the number of ambiguities at each index position. The expectation is that the first 25 or so bases will have a large number of ambiguities, falling to near zero by position 50. This is the run length required to get the primer annealed and incorporating nucleotides. Next will follow 800-1200 positions with a near zero ambiguity count; how long exactly is a function of the sequencing quality. Towards the end of the run the ambiguities begin to rise as the polymerase loses energy. Finally the ambiguity count will fall as the reads terminate.

Create a vector nbsum that will tally the count of ambiguities at a given index. Then process through each sequence and count, at each index, the number of ambiguities. The total count of ambiguities is entered into nbsum at the corresponding index position.

nbsum <- vector(mode="integer", length=2000)

for(i in 1:length(s.b)){
  seq.i <- getSequence(s.b[[i]])[[1]]
  for(j in 1:length(seq.i)){
    if(seq.i[j] == "n") nbsum[j] <- nbsum[j] + 1
  }
}

> nbsum[1:100]
[1] 167 168 168 168 168 166 163 164 160 153 149 142 131 150 135 125 120 111
[19] 99 93 79 80 59 51 61 52 48 38 26 20 17 17 22 20 18 14
[37] 16 15 13 11 14 16 23 21 13 12 6 7 5 6 9 5 3 3
[55] 1 4 3 1 1 3 3 3 2 0 1 0 0 0 0 1 1 1
[73] 5 5 12 14 21 25 24 28 29 31 21 20 8 8 7 10 4 2
[91] 2 3 5 1 3 0 3 1 1 0
>


x <- 1:1200
plot(nbsum[x])

Overlay the reverse reads in red.

nfsum <- vector(mode="integer", length=2000)

for(i in 1:length(s.f)){
  seq.i <- getSequence(s.f[[i]])[[1]]
  for(j in 1:length(seq.i)){
    if(seq.i[j] == "n") nfsum[j] <- nfsum[j] + 1
  }
}

points(nfsum[x], col="red")

I have created a shiny app that implements the above code. Download it here.

Installation

Edit your channels.scm file to include the labsolns channel

Once edited:


$ guix pull
$ guix package -i seqeval
$ source $HOME/.guix-profile/etc/profile

# Run the bash script
$ seqeval.sh

Systems

A systems approach involves the integration of multiple independent commercial and custom software products that work in unison towards a common goal. It provides flexibility by allowing individual components to be upgraded, or discarded and replaced, as requirements change.

Advantages

  • Flexible; can evolve as process evolves
  • Best of breed components can be used
  • Portability of knowledge (Spotfire, R, SQL)
  • Adaptable to containerization

Disadvantages

  • Components on different upgrade cycles
  • Components use different technologies with scattered expertise
  • Configuration challenges: missing libraries, auxiliary software
  • May depend on external network connectivity
  • User training can be challenging
  • Integration can be challenging

References

  • Microservices as innovation enablers
  • Best practices == common practices
  • Split the monolith
  • Trulia switches to “Islands”
  • A contrarian’s (with vested interests) view

Case study of a monolith implementation: Why Doctors Hate Their Computers. Discusses feature creep and the “Tar Pit”.

Proprietary IT gives big companies their edge.

Rob Brigham, Amazon AWS senior manager for product management: “Now, don’t get me wrong. It was architected in multiple tiers, and those tiers had many components in them. But they’re all very tightly coupled together, where they behaved like one big monolith. Now, a lot of startups, and even projects inside of big companies, start out this way. They take a monolith-first approach, because it’s very quick, to get moving quickly. But over time, as that project matures, as you add more developers on it, as it grows and the code base gets larger and the architecture gets more complex, that monolith is going to add overhead into your process, and that software development lifecycle is going to begin to slow down.”

When computational pipelines go ‘clank’


Targets

For a definition of target see the layouts page. Targets are primarily used to annotate data and assist with merging LIMS*Nucleus data with data from other systems. Defining targets is optional and if not done, generic “Target1”, “Target2” labels will be used in output. Using targets requires three steps:

  1. Register targets individually or (administrator) import in bulk.
  2. Define target layouts.
  3. Apply layouts to plate sets.

Defining layouts only makes sense when creating assay plate sets. Apply the target layout during the reformatting step.

There are two methods of importing targets:

Bulk import by an administrator

Under the admin menu item select “Bulk target import”. A file chooser dialog will appear. Choose an import file with the format described below:

1
2
3
4
5
6
7
8
9
project  target    description                             accession
1        muCD71    Mouse transferrin receptor              FHD8SU29
1        huCD71    Human transferrin receptor              JDHSU789
1        cynoCD71  Monkey transferrin receptor             KSIOW8H3
1        BSA       Bovine serum albumin                    KEUI87YH
2        Lysozyme  Lysozyme                                KDJFG98D
2        GAPDH     Glyceraldehyde Phosphate Dehydrogenase  KFIIOD09
2        ICAM4     ICAM 4 integrin                         KL0OIE7U
2        IL21R     IL21 receptor                           KOI89IUY

Here is an example target import file: targets200.txt

Column header spelling, capitalization, and order are critical. Indicate the project to which the target should be associated in column one. Import will fail if the project id is not in the database. For targets that should be available to all projects, place “NULL” (no quotes) in the first column. Only administrators can designate target project id as NULL during bulk import. Note that currently there is no opportunity to update an accession at a later time should it be blank upon import.
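Because the header check is strict, a pre-flight validation script can catch a bad file before import. A hypothetical Python sketch, assuming a tab-delimited file as the header row suggests (the function name and error message are illustrative, not part of LIMS*Nucleus):

```python
import csv
import io

# Exact header required by the bulk import, in order.
REQUIRED_HEADER = ["project", "target", "description", "accession"]

def check_target_import(text):
    """Validate a bulk target import file and return its rows as dicts.

    Raises ValueError if the header spelling, capitalization, or order
    does not match REQUIRED_HEADER exactly.
    """
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    header, body = rows[0], rows[1:]
    if [h.strip() for h in header] != REQUIRED_HEADER:
        raise ValueError("header must be exactly: %s" % "\t".join(REQUIRED_HEADER))
    return [dict(zip(REQUIRED_HEADER, r)) for r in body]
```

Remember that a “NULL” in the project column (admin-only) marks a target as available to all projects; this sketch passes it through unchanged.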

One at a time import by users

Under the menu bar, Targets/Add New Target will show all targets. At the top, use the tool button to navigate to the add target page:

Fill in the form. Press Submit. The target is associated with the current project and is only available within that project. Once targets have been registered, they can be used in a target layout.

Algorithms

Processing steps

  1. Import data, setting all negative values to 0.

    On a plate by plate basis:

  2. Calculate the average of all wells labeled “blank” to obtain the plate specific background signal.

  3. Subtract the background from all signals to obtain background subtracted values (bkgrnd_sub below), which are used in all further calculations.

  4. Set all background subtracted values that are less than zero to zero.

  5. For layouts utilizing duplicates (2S4T, 4S2T), average the duplicates.

  6. Calculate norm, norm_pos, and p_enhance as described below.
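The per-plate steps above can be sketched compactly. This is an illustrative Python sketch, not the LIMS*Nucleus implementation; the function name and the (annotation, value) data shape are assumptions:

```python
def process_plate(wells):
    """Apply the per-plate import steps to one plate.

    wells: list of (annotation, raw_value) tuples, where annotation is
    e.g. "blank", "positive", or "unknown".
    """
    # Step 1: set all negative raw values to 0 on import.
    raw = [(a, max(v, 0.0)) for a, v in wells]
    # Step 2: plate specific background = mean of wells labeled "blank".
    blanks = [v for a, v in raw if a == "blank"]
    background = sum(blanks) / len(blanks)
    # Steps 3-4: subtract background, clamp negatives to zero.
    return [(a, max(v - background, 0.0)) for a, v in raw]
```

For example, with blanks of 1 and 3 the background is 2, so an unknown of 10 becomes 8 and an unknown of -5 is clamped to 0.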

Background subtraction, normalization

Upon data import, raw values are stored and processed as described above, then the calculations below are performed to yield additional columns of stored data.

column      Description
raw         imported raw data
bkgrnd_sub  mean of all wells annotated “blank” subtracted from each raw value
norm        each value normalized to the maximum of the background subtracted values annotated “unknown”
norm_pos    each value normalized to the mean of the background subtracted values annotated “positive”
p_enhance   percent enhancement over the positive control: 100*((bkgrnd_sub - mean(positive))/mean(positive))
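The derived columns can be sketched from the background subtracted values. This is an illustrative Python sketch, not the LIMS*Nucleus implementation; in particular, the p_enhance formula is reconstructed from its description (percent enhancement over the positive control) since the original formula is truncated here:

```python
def derived_columns(bkgrnd_sub):
    """Compute norm, norm_pos, and p_enhance for one plate.

    bkgrnd_sub: list of (annotation, value) after background subtraction.
    """
    # norm scales to the largest "unknown"; norm_pos to the mean "positive".
    max_unknown = max(v for a, v in bkgrnd_sub if a == "unknown")
    positives = [v for a, v in bkgrnd_sub if a == "positive"]
    mean_pos = sum(positives) / len(positives)
    return [
        {"bkgrnd_sub": v,
         "norm": v / max_unknown,
         "norm_pos": v / mean_pos,
         # assumed form: percent enhancement over the positive control
         "p_enhance": 100.0 * (v - mean_pos) / mean_pos}
        for a, v in bkgrnd_sub
    ]
```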

Hit identification

Algorithm

Label  Hit threshold
       mean(neg) + 3SD
       mean(neg) + 2SD
       >0% enhanced
Top N  Highest N responses from unknowns
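The threshold rules above can be sketched as follows. This is an illustrative Python sketch, not the LIMS*Nucleus implementation; the choice of population standard deviation (statistics.pstdev) is an assumption, as the source does not specify which SD is used:

```python
import statistics

def sd_threshold(neg_values, n_sd):
    """Hit threshold as mean(neg) + n_sd standard deviations of the negatives."""
    return statistics.mean(neg_values) + n_sd * statistics.pstdev(neg_values)

def is_hit(value, neg_values, n_sd=3):
    """True when a response exceeds the mean(neg) + n_sd*SD threshold."""
    return value > sd_threshold(neg_values, n_sd)

def top_n(unknowns, n):
    """unknowns: list of (sample_id, response); return the N highest responses."""
    return sorted(unknowns, key=lambda sv: sv[1], reverse=True)[:n]
```

With negatives of 0 and 2 (mean 1, SD 1), the 3SD threshold is 4, so a response of 5 is a hit and a response of 3 is not.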



References

Sittampalam GS, Coussens NP, Brimacombe K, et al., editors. Assay Guidance Manual [Internet]. Bethesda (MD): Eli Lilly & Company and the National Center for Advancing Translational Sciences; 2004-.

Kelley BP, Lunn MR, Root DE, Flaherty SP, Martino AM, Stockwell BR. A Flexible Data Analysis Tool for Chemical Genetic Screens. Chemistry & Biology 11:1495-1503, November 2004.