Scientific Research (Biochemistry) File 1
DIPEPTIDE SEQUENCE DETERMINATION
I. INTRODUCTION
Purification of the sample protein is the first step in determining the detailed structure and
biological function of that protein. As discussed in the previous biochemistry laboratory
reports, a protein has several levels of structural organization. One of the most crucial
determinants of a protein’s activity is its primary structure, the sequence of amino acid
residues. Every protein has a unique number and sequence of amino acids, which determines how
the molecule folds into its three-dimensional conformation (Nelson & Cox, 2007). Therefore,
determining the sequence of a peptide chain is important for understanding the relationship
between the chemical properties of the protein and its biological function (Boyer, 2000).
Two techniques are commonly used to sequence peptide chains: Edman degradation, developed by
the Swedish biochemist Pehr Victor Edman, and Sanger’s method, developed by the British
biochemist Frederick Sanger. Sanger first determined the complete protein sequence of bovine
insulin in 1953, which earned him the Nobel Prize in Chemistry in 1958 (Dubey, n.d.).
In this exercise, the amino acid sequence of an unknown dipeptide sample was determined using
Sanger’s method. This technique identifies the terminal amino acid residues of peptides and
proteins by employing 1-fluoro-2,4-dinitrobenzene (FDNB). The sequence of the unknown dipeptide
was determined in two parts: its amino acid composition was identified through hydrolysis
followed by paper chromatography, and its N-terminal residue was identified by analyzing the
2,4-DNP-amino acid derivative formed in Sanger’s method using thin layer chromatography.
II. METHODOLOGY
A. Total Hydrolysis of the Dipeptide
The unknown dipeptide sample and the standard amino acid samples were prepared prior to the
experiment. One mL of diethyl ether was added to the sample, which was spotted on
chromatography paper along with the standards. The paper was developed in a chromatographic
chamber previously equilibrated with the solvent, 4:1:1 butanol-acetic acid-water. The
chromatogram was removed from the chamber after the solvent had reached about a centimeter
from the top of the paper. The solvent front was marked with a pencil, and the paper was then
air-dried. The amino acids were located by reaction with ninhydrin: the paper was sprayed with
ninhydrin and heated using a blow-dryer. The amino acids appeared as blue-violet spots. The
spots were marked, and the distances travelled were measured and interpreted.
B. Derivatization
The DNP derivatives of the sample dipeptide and of the standards were prepared. Five
milligrams of each amino acid standard and of the dipeptide were mixed with 1.00 mL saturated
NaHCO3, 1.00 mL water, and 1.00 mL of a 5% ethanolic solution of dinitrofluorobenzene (FDNB) in
a small test tube. The pH was adjusted to about 8 using solid NaHCO3 and was monitored using pH
paper. A marble was placed on top of each test tube, and the tubes were heated in a water bath
at 80°C for about 15 minutes.
The DNP derivative of each amino acid and of the dipeptide sample was hydrolyzed during this
process. Each mixture was cooled and shaken with 5.00 mL portions of ether. To remove the
excess ethanol and excess FDNB, the aqueous layer was transferred to a new test tube using a
Pasteur pipet. This left an aqueous suspension of the DNP-dipeptide. About 0.1 mL of 6 N HCl
was added to each solution to adjust its pH to approximately 1. The DNP-amino acid was
extracted from this hydrolysate using 2.00 mL portions of ether. The ether layers were pooled,
and the samples were evaporated in a hot water bath.
The reaction mixtures were cooled and extracted in a 5.0 mL beaker. The bottom aqueous layer
was separated using a Pasteur pipet. The extraction was repeated three times, until the yellow
color of the solution had slightly faded. The pH of the aqueous layer was adjusted to about 1
using 6 N HCl solution. The solution was then extracted three times with 2.00 mL portions of
ether. The ether extracts were pooled in a 50 mL beaker. Most of the solvent was evaporated on
a warm hot plate.
C. Thin Layer Chromatography
The samples were analyzed using thin layer chromatography (TLC). About 60 mL of a 94:5:1
CH2Cl2-MeOH-CH3COOH solvent system was prepared. The solution was mixed well and poured into a
chromatographic tank lined with filter paper. It was allowed to stand for 30 minutes for
equilibration. The sample and the standards were spotted onto a previously prepared TLC plate
using a fine glass capillary tube. The chromatogram was developed inside the chamber. The
DNP-amino acids appeared as bright yellow spots on the TLC plate.
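The volume of each component needed for the 94:5:1 solvent system can be worked out with a short calculation. The sketch below assumes that the ratio is in parts by volume and that the volumes are additive; neither is stated in the procedure. The function name `component_volumes` is hypothetical.

```python
def component_volumes(ratio, total_ml):
    """Split a total volume among components given in parts by volume."""
    parts = sum(ratio.values())
    return {name: total_ml * share / parts for name, share in ratio.items()}


# Assumption: 94:5:1 is a volume ratio and mixing is ideal (volumes add up).
tlc_solvent = component_volumes({"CH2Cl2": 94, "MeOH": 5, "CH3COOH": 1}, 60.0)

for name, volume in tlc_solvent.items():
    print(f"{name}: {volume:.1f} mL")
```

The same function applies to the 4:1:1 butanol-acetic acid-water system used for paper chromatography.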
III. RESULTS AND DISCUSSION
A dipeptide sample was analyzed in the laboratory and its sequence was determined. First, the
amino acid composition of the dipeptide was identified by analyzing the hydrolyzed sample
through paper chromatography. Harsh conditions were needed to break the strong peptide bond:
0.1 mL of 6 N HCl solution was added to the dipeptide sample, which was incubated in an oven
overnight at 110°C. The hydrolyzed dipeptide sample and the standard amino acids were spotted
on chromatography paper, which was developed in the chromatographic chamber and visualized
using ninhydrin. This reagent reacts with amino acids to give derivatives that absorb strongly
at 570 nm. The paper chromatogram is shown in Figure 6.1. From this figure, the distances
travelled by the spotted samples and their retention factors (Rf) were determined; they are
summarized in Table 6.1.
Figure 6.1. Paper chromatogram of the amino acid standards and dipeptide sample (S2).
Table 6.1. Distances travelled by the standard amino acids and the dipeptide compositions in
the chromatographic paper with their corresponding Rf values.

Group       Amino Acid       Distance Travelled, mm    Rf
Standards   Phenylalanine    -                         -
            Glycine          -                         -
            Valine           -                         -
            Serine           -                         -
            Glutamic acid    -                         -
            Threonine        -                         -
Sample      Unknown          -                         -
            Unknown          -                         -
Solvent     Solvent Front    82                        1
The results of paper chromatography suggest that the dipeptide sample is composed of threonine
and either phenylalanine or valine. Comparison with Figure 6.1 suggests that the other
component is valine, since its spot lies at nearly the same height as that of the valine
standard. These inferences are based on the comparable Rf values of the sample components and
the standard amino acids. The structures of the amino acid components of the unknown dipeptide
are shown in Figure 6.2.
Figure 6.2. Structures of Threonine (left) and Valine (right).
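The Rf comparison used to assign the components can be sketched as follows. The distances below are hypothetical placeholders and are not the measured values of Table 6.1; only the 82 mm solvent front comes from that table. The 0.05 tolerance and the function names are also assumptions made for illustration.

```python
def rf(distance_mm, solvent_front_mm):
    """Retention factor: spot distance divided by solvent front distance."""
    return distance_mm / solvent_front_mm


def closest_standards(unknown_rf, standards, tolerance=0.05):
    """Return standards whose Rf lies within `tolerance` of the unknown, nearest first."""
    hits = [(abs(value - unknown_rf), name)
            for name, value in standards.items()
            if abs(value - unknown_rf) <= tolerance]
    return [name for _, name in sorted(hits)]


FRONT = 82.0  # solvent front (mm) reported in Table 6.1

# Hypothetical spot distances (mm), for illustration only.
standards = {name: rf(d, FRONT) for name, d in
             {"Phe": 50.0, "Val": 46.0, "Thr": 22.0, "Gly": 15.0}.items()}

print(closest_standards(rf(47.0, FRONT), standards))
print(closest_standards(rf(22.0, FRONT), standards))
```

With these placeholder distances, an unknown spot can match more than one standard within the tolerance, which is why the spot heights on the chromatogram were also compared directly.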
To determine the sequence of the dipeptide, the sample was analyzed using Sanger N-terminal
analysis. The method, developed by Frederick Sanger, identifies the amino acid residue at the
N-terminal end of a polypeptide chain using the reagent dinitrofluorobenzene (FDNB) (Protein
Primary Structure, 2015). FDNB reacts in basic solution with the free amino group of the
N-terminal residue of the polypeptide chain via nucleophilic aromatic substitution (SNAr). The
pH of the solution containing the peptide sample should be around 8 so that the free amino
group is uncharged, which makes the nitrogen nucleophilic enough for the SNAr reaction to
proceed (Solomons & Fryhle, 2012).
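The effect of pH on the amino group can be illustrated with the Henderson-Hasselbalch relation. The sketch below is not part of the original procedure, and the pKa value is an illustrative assumption; the actual pKa of the N-terminal amino group depends on the peptide.

```python
def fraction_uncharged(ph, pka):
    """Henderson-Hasselbalch: fraction of an amine present as the neutral -NH2 form."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


ASSUMED_PKA = 8.0  # illustrative value only; not given in the report

for ph in (6.0, 8.0, 10.0):
    print(f"pH {ph}: {fraction_uncharged(ph, ASSUMED_PKA):.3f} uncharged")
```

Under this assumption, the nucleophilic, uncharged form makes up only a small fraction of the amine in acidic solution and becomes significant once the pH approaches the pKa.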
The pH is adjusted using NaHCO3. Additional sodium bicarbonate may be required during the
reaction because hydrofluoric acid is produced in the solution. NaHCO3, a weak base, was used
because a strong base such as NaOH would favor the formation of 2,4-dinitrophenol as a side
product from the reaction between FDNB and hydroxide (OH-) ions (Sequence Determination of a
Dipeptide, 2007).
Figure 6.3. Reaction of DNP in the presence of strong base.
The reaction of the polypeptide sample with FDNB is shown in Figure 6.4.
Figure 6.4. General reaction for Sanger’s method of determining the sequence of a peptide
or protein sample.
Once the “labeled polypeptide”, the DNP-peptide derivative, is formed, it is hydrolyzed in
acid. Hydrolysis yields the uncharged “labeled N-terminal amino acid”, the DNP-amino acid
derivative, together with the charged free amino acid residues. The charged amino acids remain
in the aqueous solution, while the DNP-amino acid derivative is extracted with ether. The
separated N-terminal DNP-amino acid was identified using thin layer chromatography. The amino
acid standards were also reacted with FDNB to form their DNP-amino acid derivatives, which
were likewise analyzed by thin layer chromatography for comparison with the unknown sample.
Figure 6.5 shows the thin layer chromatogram from which the distances travelled by the
DNP-amino acid standards and by the DNP-amino acid from the dipeptide sample, along with their
corresponding Rf values, were obtained. The distances travelled and the Rf values are
summarized in Table 6.2.
Figure 6.5. Thin-layer chromatogram of the DNP derivatives of the standard amino acids and
sample.
Table 6.2. Distances travelled by the standard DNP-amino acid derivatives and the N-terminal
DNP-amino acid complex with their corresponding Rf values using thin layer
chromatography.

Group       Amino Acid       Distance Travelled, mm    Rf
Standards   DNP-Phe          -                         -
            DNP-Gly          -                         -
            DNP-Val          -                         -
            DNP-Ser          -                         -
            DNP-Glu          -                         -
            DNP-Thr          -                         -
Sample      DNP-Unknown      -                         -
Solvent     Solvent Front    77                        1
Using the results of paper chromatography and thin layer chromatography, the sequence of the
unknown dipeptide sample can now be deduced. Since the DNP-unknown has an Rf value comparable
to that of DNP-threonine, the N-terminal residue of the dipeptide is threonine and the sequence
is therefore Thr-Val. The structure of the dipeptide sample is shown in Figure 6.6.
Figure 6.6. Structure of the dipeptide sample.
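The deduction combines two pieces of evidence: the composition from paper chromatography and the N-terminal residue from thin layer chromatography. A minimal sketch of that logic follows; the function name `dipeptide_sequence` is hypothetical.

```python
def dipeptide_sequence(composition, n_terminal):
    """Order a two-residue composition so that the N-terminal residue comes first."""
    residues = list(composition)
    if len(residues) != 2 or n_terminal not in residues:
        raise ValueError("need two residues, one of which is the N-terminal residue")
    residues.remove(n_terminal)  # the residue left over is the C-terminal one
    return f"{n_terminal}-{residues[0]}"


# Composition from paper chromatography, N-terminal residue from TLC.
print(dipeptide_sequence(["Val", "Thr"], "Thr"))  # Thr-Val
```

For a dipeptide, the two results fix the sequence completely; for a longer peptide, the N-terminal residue alone would leave the order of the remaining residues undetermined.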
To further illustrate the reaction, consider peptide chains whose N-terminal residues are Lys,
His, and Tyr. The structures of their products with Sanger’s reagent (FDNB) are shown in the
following figures:
Figure 6.7. Structure of the DNP-lysine complex.
Figure 6.8. Structure of the DNP-histidine complex.
Figure 6.9. Structure of the DNP-tyrosine complex.
When an amino acid is not the N-terminal residue, it exists in solution in its free form after
acid hydrolysis. Therefore, if Lys, His, and Tyr are internal or C-terminal residues of the
peptide chain, they will exist in solution in their charged forms and remain soluble in the
aqueous phase.
If an unknown pentapeptide is analyzed using the same method as in this experiment but no
DNP-amino acid is obtained, it is possible that the N-terminal DNP-amino acid was destroyed by
the long acid hydrolysis. The N-terminal DNP-amino acid could be DNP-tryptophan, DNP-glycine,
or DNP-proline, which are known to be unstable to different extents and are prone to
degradation (Maramorosch & Koprowski, 2014).
Enzymes that hydrolyze peptide chains are classified as proteases; they digest long protein
chains into shorter fragments. These enzymes are used in protein sequence analysis because of
their specificity in cleaving peptide bonds. Aminopeptidases and carboxypeptidases cleave
terminal amino acids from the peptide chain. An aminopeptidase cleaves the peptide bond on the
carboxyl side of the N-terminal amino acid, while a carboxypeptidase cleaves the peptide bond
on the amino side of the C-terminal amino acid. Internal cleavage of a peptide chain is carried
out by trypsin, chymotrypsin, and thermolysin. Trypsin cleaves on the carboxyl side of the
basic amino acids arginine and lysine. Chymotrypsin cleaves on the carboxyl side of the
aromatic amino acids (phenylalanine, tryptophan, and tyrosine) and of leucine. Lastly,
thermolysin cleaves peptide bonds on the amino side of the aromatic amino acids
(phenylalanine, tryptophan, and tyrosine) and of amino acids with bulky side chains such as
leucine, isoleucine, and valine.
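The cleavage rules described above can be sketched as a simple in-silico digestion. This is a simplified illustration that encodes only the specificities stated in the paragraph and ignores real-world exceptions (for example, the influence of neighboring residues). The peptide `AKGFRTV` and the names `RULES` and `digest` are hypothetical.

```python
# Simplified cleavage rules from the paragraph above (one-letter residue codes).
# side "C": cut after the residue; side "N": cut before the residue.
RULES = {
    "trypsin": ("C", set("KR")),
    "chymotrypsin": ("C", set("FWYL")),
    "thermolysin": ("N", set("FWYLIV")),
}


def digest(sequence, enzyme):
    """Return the fragments produced by cutting `sequence` according to RULES."""
    side, residues = RULES[enzyme]
    cuts = set()
    for i, aa in enumerate(sequence):
        if aa in residues:
            cut = i + 1 if side == "C" else i
            if 0 < cut < len(sequence):  # ignore cuts at the chain ends
                cuts.add(cut)
    bounds = [0] + sorted(cuts) + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


peptide = "AKGFRTV"  # hypothetical peptide, for illustration only
for enzyme in RULES:
    print(enzyme, digest(peptide, enzyme))
```

Comparing the fragments produced by different enzymes gives overlapping pieces, which is what makes these proteases useful for ordering fragments of a longer chain.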
These methods are effective only for identifying the sequence of dipeptide samples. The process
described above has many limitations, especially when working with longer polypeptide chains.
Sanger’s N-terminal analysis is not appropriate for sequencing peptides with more than two
amino acid residues, because by the time the N-terminal DNP-amino acid has been formed and
released, the peptide has already been hydrolyzed into its constituent amino acids. The process
therefore cannot be repeated in cycles on the same sample. Moreover, FDNB can also react with
amino groups in the side chains of amino acids, which can give erroneous results
(Dubey, n.d.).
In the method above, the dipeptide was hydrolyzed to identify its component amino acids;
however, other reactions may occur in the released amino acids. These reactions are shown in
the following figures:
Figure 6.10. Dehydration of Ser.
Figure 6.11. Oxidation of Met.
Figure 6.12. Oxidation of Cys.
Figure 6.13. Oxidation of Cystine.
Figure 6.14. Hydrolysis of Gln.
Figure 6.15. Hydrolysis of Asn.
IV. SAMPLE CALCULATIONS
1. Determination of Rf values for both standards and samples
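As a sketch of this calculation, assuming the standard definition of the retention factor, Rf is the distance travelled by a spot divided by the distance travelled by the solvent front. The symbols d_spot and d_front are introduced here for illustration; the solvent front distances are those reported in Tables 6.1 (paper, 82 mm) and 6.2 (TLC, 77 mm).

```latex
R_f = \frac{d_{\text{spot}}}{d_{\text{front}}}, \qquad
R_f^{\text{paper}} = \frac{d_{\text{spot}}}{82\ \text{mm}}, \qquad
R_f^{\text{TLC}} = \frac{d_{\text{spot}}}{77\ \text{mm}}
```

By this definition the solvent front itself has Rf = 1, as listed in both tables.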
V. SUMMARY AND CONCLUSIONS
Sanger’s method of N-terminal analysis was employed to determine the sequence of an unknown
dipeptide sample. The first step involved the hydrolysis of the dipeptide sample to free its
amino acid components in solution, while the second step involved analysis through paper
chromatography and thin layer chromatography. The Rf values were determined for both the
standards and the sample dipeptide. The identities of the amino acid components were
determined by comparing the sample’s Rf values against those of the standards.
The sample dipeptide was found to be composed of valine and threonine. Reacting the peptide
sample with FDNB identified the N-terminal residue of the peptide, which in turn established
the sequence. Upon analysis, the dipeptide sample was identified as Thr-Val.
VI. REFERENCES
1. Boyer, R. F. (2012). Biochemistry Laboratory: Modern Theory and Techniques (2nd ed.).
Pearson Education, Inc.
2. Dubey, V. K. (n.d.). Lecture 18: Protein Sequencing. Proteomics & Genomics.
3. Lehninger, A. L., Nelson, D. L., & Cox, M. M. (2007). Lehninger Principles of
Biochemistry. New York: Worth Publishers.
4. Maramorosch, K., & Koprowski, H. (2014). Methods in Virology (Vol. 3). Academic Press.
5. Protein Primary Structure Sequencing Methods. (n.d.).
6. Sequence Determination of a Dipeptide. (2007).
7. Solomons, T. W., Fryhle, C., & Snyder, S. (2012). Organic Chemistry (11th ed.). Wiley
Global Education.