

Topic: signal processing in forensic DNA analysis


Offline Babcock_Hall

  • Chemist
  • Sr. Member
  • *
  • Posts: 5615
  • Mole Snacks: +322/-22
signal processing in forensic DNA analysis
« on: January 21, 2014, 12:46:14 PM »
Modern DNA forensics makes use of capillary electrophoresis.  DNA forensic analysts say that when they do a case review, their review is not complete until they see the electronic data files.  The EDFs are the "raw data."  Butler wrote, "GeneScan software contains six different screens that may be used as part of data analysis and evaluation: processed data (color-separated), size standard curve, electrophoresis history, sample information, raw data (no color separation), and an analysis log file."  I would like to know more about the process of converting raw data into processed data.  One question I have is how heavily the data can be processed in producing the final electropherogram.  Butler also wrote, "These macros can be designed to filter out stutter peaks (see [59]) that may interfere with sample interpretation."  Stutter is an artifact of the PCR process, but I can see that there might be some misuse of such a function.
http://www.cstl.nist.gov/div831/strbase/pub_pres/Butler2004a.pdf

Offline rjb

  • Full Member
  • ****
  • Posts: 124
  • Mole Snacks: +17/-0
Re: signal processing in forensic DNA analysis
« Reply #1 on: March 21, 2014, 07:54:04 PM »
BH,

I'm sorry I didn't spot this earlier...

Firstly, it is probably worth noting that Genescan has been superseded by Genemapper/IDX/True Allele (Yuk!)/FSS-I3/DNA INSIGHT (never released commercially outside of one UK forensic science provider), etc., but nevertheless the principles of data processing (at least in the Applied Biosystems software) remain similar.

From memory (and after a few glasses of red!), raw data is in a proprietary format (.fsa) and, once imported into the software, a 'matrix' can be applied which corrects for spectral overlap (if this hasn't taken place at the CE stage) - this lowers the impact of so-called pull-up. The data then goes through a second stage at which point analysis parameters can be applied which allow calculation of allele size against the internal size standard, calculation of allele designation against allelic ladders, and various jiggery-pokery to apply peak smoothing etc. Genemapper includes a peak filter to remove (i.e. not label) sub-threshold peaks and stutters in both the -4 and +4 positions according to user parameters allowing a definable % threshold, e.g. 15% of the parent allele. A similar functionality existed in Genescan but was based on macros.
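To make that concrete, here is a minimal Python/numpy sketch of the two steps (spectral correction, then threshold/stutter labelling). The matrix values, the 50 RFU analytical threshold and the 15% stutter ratio are made-up illustrations, not the actual GeneScan/Genemapper implementation, and the filter only attaches labels - nothing is removed from the data.

```python
import numpy as np

# Illustrative only: toy spectral (matrix) correction and peak labelling.
# Matrix values, the 50 RFU threshold and the 15% stutter ratio are
# assumptions, not GeneScan/Genemapper internals.

# Each dye bleeds into neighbouring detection channels; M maps true dye
# signal -> observed channels, so solving M x = observed removes pull-up.
M = np.array([
    [1.00, 0.12, 0.02, 0.00],
    [0.10, 1.00, 0.15, 0.03],
    [0.02, 0.11, 1.00, 0.14],
    [0.00, 0.03, 0.12, 1.00],
])
observed = np.array([520.0, 480.0, 130.0, 60.0])  # one scan, 4 channels (RFU)
corrected = np.linalg.solve(M, observed)          # colour-separated signal

# Peak filtering labels peaks; it does not delete them. A peak one repeat
# (4 bases) below a taller parent and under the stutter ratio is flagged.
# A real filter also checks the +4 position; this sketch only checks -4.
ANALYTICAL_THRESHOLD = 50.0   # RFU (assumed)
STUTTER_RATIO = 0.15          # 15% of parent allele (assumed)
REPEAT = 4                    # tetranucleotide repeat unit

def label_peaks(peaks):
    """peaks: list of (size_in_bases, height_in_RFU) tuples."""
    labels = []
    for size, height in peaks:
        if height < ANALYTICAL_THRESHOLD:
            labels.append("below threshold (not labelled)")
            continue
        parent = next((h for s, h in peaks
                       if abs(s - (size + REPEAT)) < 0.5 and h > height), None)
        if parent is not None and height <= STUTTER_RATIO * parent:
            labels.append("stutter (not labelled)")
        else:
            labels.append("allele")
    return labels

peaks = [(176.1, 38.0), (180.0, 210.0), (184.0, 1450.0), (188.0, 1380.0)]
for peak, label in zip(peaks, label_peaks(peaks)):
    print(peak, label)
```

Running it, the 38 RFU peak stays unlabelled, the 210 RFU peak at 180 bases is flagged as stutter of the 1450 RFU allele at 184 bases (210/1450 is about 14%, under the 15% cut-off), and the two tall peaks are labelled as alleles.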

Regarding misuse, the software does not delete the data; it only un-labels the peaks in question, so data integrity is not an issue. Stutters can be re-labelled by the user. In my view, a bigger area of concern is data smoothing (which I know for a fact has been misused) and misuse of CE injection parameters (i.e. applying longer injection times) to bring low alleles above threshold - very bad practice.
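To show why smoothing worries me, here is a toy sketch (not any vendor's algorithm) of how a wide smoothing window can drag a genuine low-level peak below an analytical threshold. The window widths and the 50 RFU threshold are illustrative assumptions.

```python
import numpy as np

# Toy example: a narrow ~60 RFU allele peak on a noisy baseline.
rng = np.random.default_rng(0)
scan = np.arange(200)
true_peak = 60.0 * np.exp(-0.5 * ((scan - 100) / 2.0) ** 2)  # narrow allele peak
trace = true_peak + rng.normal(0.0, 5.0, scan.size)          # add baseline noise

def moving_average(x, window):
    return np.convolve(x, np.ones(window) / window, mode="same")

for window in (3, 9, 25):
    apparent = moving_average(trace, window).max()
    print(f"window={window:2d}  apparent peak height ~ {apparent:5.1f} RFU")

# A wide window drags the apparent height of a genuine ~60 RFU allele
# below a 50 RFU threshold, so over-smoothing can make real data vanish.
```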

R   

Offline Babcock_Hall

  • Chemist
  • Sr. Member
  • *
  • Posts: 5615
  • Mole Snacks: +322/-22
Re: signal processing in forensic DNA analysis
« Reply #2 on: March 22, 2014, 09:35:54 AM »
rjb,

Thank you; that is very helpful.  In one of the egrams I am studying, the majority of alleles are below 50 RFU.  I was unaware of the issue surrounding CE injection times.  If I understand your comment correctly, it supports what independent forensic scientists say about case reviews, namely that looking over the electronic data files is an essential component of a complete case review.

Offline rjb

  • Full Member
  • ****
  • Posts: 124
  • Mole Snacks: +17/-0
Re: signal processing in forensic DNA analysis
« Reply #3 on: March 24, 2014, 06:18:44 PM »
BH,

As you say, data/log files are certainly a potential area to consider when carrying out full case reviews. It's not so much of an issue this side of the pond, where parameters tend to be set in stone once validated, but in the US (in my experience) there does tend to be a little more fudging.

Low-level data is certainly a good area for consideration and is notoriously unreliable. It is probably worth noting that just because an allele is above threshold does not guarantee the reliability of that allele call. It is not unheard of for an otherwise 'perfect' "homozygote locus" at 140 RFU to reveal the presence of a second allele when re-PCR'd, hence the reason for a higher designation threshold for homozygotes. From memory, sub-150 RFU data was generally considered less reliable.
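A rough sketch of that designation logic (the 50 and 150 RFU figures below are illustrative assumptions, not any laboratory's validated thresholds):

```python
# A single peak at a locus is only called homozygous if it clears a higher
# threshold, because a dropped-out second allele cannot otherwise be excluded.
ANALYTICAL_THRESHOLD = 50.0   # RFU: peak is labelled at all (assumed)
HOMOZYGOTE_THRESHOLD = 150.0  # RFU: a lone peak may be called homozygous (assumed)

def designate_locus(peak_heights):
    alleles = [h for h in peak_heights if h >= ANALYTICAL_THRESHOLD]
    if len(alleles) >= 2:
        return "heterozygote"
    if len(alleles) == 1 and alleles[0] >= HOMOZYGOTE_THRESHOLD:
        return "homozygote"
    if len(alleles) == 1:
        return "single allele below homozygote threshold - possible dropout"
    return "no result"

print(designate_locus([140.0]))         # 'perfect' looking peak, but do not call
print(designate_locus([620.0]))         # safe to designate homozygous
print(designate_locus([480.0, 430.0]))  # two balanced alleles
```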

Anyhow, if you need someone to bounce some ideas off, please feel free to get in contact.

Kind Regards

R

