AU686408B2 – Plant genes for sensitivity to ethylene and pathogens
Plant genes for sensitivity to ethylene and pathogens
Publication number
AU686408B2
Authority
AU
Australia
Prior art keywords
leu
xaa xaa
gly
ser
ala
Prior art date
1994-06-17
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
AU28650/95A
Other versions
AU2865095A (en)
Inventor
Joseph Ecker
Anne Lehman
Gregg Roman
Madge Rothenberg
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Pennsylvania Penn
Original Assignee
University of Pennsylvania Penn
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
1994-06-17
Filing date
1995-06-15
Publication date
1998-02-05
1995-06-15 Application filed by University of Pennsylvania Penn
1996-01-15 Publication of AU2865095A
1998-02-05 Application granted
1998-02-05 Publication of AU686408B2
2015-06-15 Anticipated expiration
Status: Ceased
Classifications
C—CHEMISTRY; METALLURGY
C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
C12N15/09—Recombinant DNA-technology
C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
C12N15/82—Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
C12N15/8241—Phenotypically and genetically modified plants via recombinant DNA technology
C12N15/8261—Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield
C12N15/8271—Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield for stress resistance, e.g. heavy metal resistance
C12N15/8279—Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield for stress resistance, e.g. heavy metal resistance for biotic stress resistance, pathogen resistance, disease resistance
C12N15/8281—Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield for stress resistance, e.g. heavy metal resistance for biotic stress resistance, pathogen resistance, disease resistance for bacterial resistance
C—CHEMISTRY; METALLURGY
C07—ORGANIC CHEMISTRY
C07K—PEPTIDES
C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
C07K14/415—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from plants
C—CHEMISTRY; METALLURGY
C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
C12N15/09—Recombinant DNA-technology
C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
C12N15/82—Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
C12N15/8241—Phenotypically and genetically modified plants via recombinant DNA technology
C12N15/8242—Phenotypically and genetically modified plants via recombinant DNA technology with non-agronomic quality (output) traits, e.g. for industrial processing; Value added, non-agronomic traits
C12N15/8243—Phenotypically and genetically modified plants via recombinant DNA technology with non-agronomic quality (output) traits, e.g. for industrial processing; Value added, non-agronomic traits involving biosynthetic or metabolic pathways, i.e. metabolic engineering, e.g. nicotine, caffeine
C12N15/8249—Phenotypically and genetically modified plants via recombinant DNA technology with non-agronomic quality (output) traits, e.g. for industrial processing; Value added, non-agronomic traits involving biosynthetic or metabolic pathways, i.e. metabolic engineering, e.g. nicotine, caffeine involving ethylene biosynthesis, senescence or fruit development, e.g. modified tomato ripening, cut flower shelf-life
Description
WO 95/35318 PCT/US95/07744
PLANT GENES FOR SENSITIVITY TO ETHYLENE AND PATHOGENS
REFERENCE TO RELATED APPLICATIONS
This application is a continuation-in-part of U.S. application Serial No. 08/003,311, filed January 12, 1993, a continuation-in-part of U.S. application Serial No. 928,464, filed August 10, 1992; this application is also a continuation-in-part of U.S. application Serial No. 08/171,207, filed December 21, 1993, which is a continuation of U.S. application Serial No. 899,262, filed June 16, 1992, now abandoned; the disclosures of which are hereby incorporated in their entirety.
REFERENCE TO GOVERNMENT GRANTS
This work was supported in part by research grants from the National Institutes of Health GM-26379 and National Science Foundation grant IBN-92-05342. The United States Government may have certain rights in this invention.
BACKGROUND OF THE INVENTION
Ethylene, a gaseous plant hormone, is involved in the regulation of a number of plant processes ranging from growth and development to fruit ripening. As in animal systems, response of plants to disease not only involves static processes, but also involves inducible defense mechanisms. One of the earliest detectable events to occur during plant-pathogen interaction is a rapid increase in ethylene biosynthesis. Ethylene biosynthesis, in response to pathogen invasion, correlates with increased defense mechanisms, chlorosis, senescence and abscission. The molecular mechanisms underlying operation of ethylene action, however, are unknown. Nonetheless, ethylene produced in response to biological stress is known to regulate the rate of transcription of specific plant genes.
A variety of biological stresses can induce ethylene production in plants, including wounding and bacterial, viral or fungal infection, as can treatment with elicitors, such as glycopeptide elicitor preparations (prepared by chemical extraction from fungal pathogen cells). Researchers have found, for example, that treatment of plants with ethylene generally increases the level of many pathogen-inducible «defense proteins», including β-1,3-glucanase, chitinase, L-phenylalanine ammonia lyase, and hydroxyproline-rich glycoproteins. The genes for these proteins can be transcriptionally activated by ethylene and their expression can be blocked by inhibitors of ethylene biosynthesis. Researchers have also characterized a normal plant response to the production or administration of ethylene as a so-called «triple response». The triple response involves inhibition of root and stem elongation, radial swelling of the stem and absence of normal geotropic response (diageotropism).
Ethylene is one of five well-established plant hormones. It mediates a diverse array of plant responses including fruit ripening, leaf abscission and flower senescence.
The pathway for ethylene biosynthesis has been established (Figure 6). Methionine is converted to ethylene with S-adenosylmethionine (SAM) and 1-aminocyclopropane-1-carboxylic acid (ACC) as intermediates. The production of ACC from SAM is catalyzed by the enzyme ACC synthase. Physiological analysis has suggested that this is the key regulatory step in the pathway, see Kende, Plant Physiol. 1989, 91, 1-4. This enzyme has been cloned from several sources, see Sato et al., PNAS, (USA) 1989, 86, 6621; Van Der Straeten et al., PNAS, (USA) 1990, 87, 4859-4863; Nakajima et al., Plant Cell Physiol. 1990, 29, 989. The conversion of ACC to ethylene is catalyzed by ethylene forming enzyme (EFE), which has been recently cloned (Spanu et al., EMBO J 1991, 10, 2007). Aminoethoxy-vinylglycine (AVG) and α-aminoisobutyric acid (AIB) have been shown to inhibit ACC synthase and EFE respectively. Ethylene binding is inhibited non-competitively by silver, and competitively by several compounds, the most effective of which is trans-cyclooctane. ACC synthase is encoded by a highly divergent gene family in tomato and Arabidopsis (Theologis, Cell 70:181 (1992)). ACC oxidase, which converts ACC to ethylene, is expressed constitutively in most tissues (Yang et al., Ann. Rev. Plant Physiol. 1984, 35, 155), but is induced during fruit ripening (Gray et al. Cell 1993 72, 427). It has been shown to be a dioxygenase belonging to the Fe2+/ascorbate oxidase superfamily (McGarvey et al., Plant Physiol. 1992, 98, 554).
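The pathway and inhibitors described above can be summarized as a small lookup structure. The following is only an illustrative sketch: the enzyme and inhibitor names come from the paragraph, while the data layout and the function names `blocked_steps` and `ethylene_can_form` are assumptions introduced here for illustration.

```python
# Ethylene biosynthesis steps as described in the text.
# The tuple layout and function names are illustrative assumptions.
PATHWAY = [
    # (substrate, product, enzyme, inhibitor named in the text)
    ("methionine", "SAM", None, None),  # enzyme for this step is not named in the text
    ("SAM", "ACC", "ACC synthase", "AVG"),
    ("ACC", "ethylene", "EFE (ACC oxidase)", "AIB"),
]


def blocked_steps(inhibitors):
    """Return the (substrate, product) steps blocked by the given inhibitors."""
    present = set(inhibitors)
    return [(s, p) for s, p, _enzyme, inhibitor in PATHWAY if inhibitor in present]


def ethylene_can_form(inhibitors):
    """True when no step between methionine and ethylene is blocked."""
    return not blocked_steps(inhibitors)


if __name__ == "__main__":
    print(blocked_steps(["AVG"]))
    print(ethylene_can_form(["AIB"]))
```

The sketch models biosynthesis only; inhibitors of ethylene binding (silver, trans-cyclooctane) act downstream of synthesis and are deliberately left out.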
Etiolated dicotyledonous seedlings are normally highly elongated and display an apical arch-shaped structure at the terminal part of the shoot axis: the apical hook. The effect of ethylene on dark grown seedlings, the triple response, was first described in peas by Neljubow in 1901, Neljubow, Pflanzen Beih. Bot. Zentralb., 1901, 10, 128. In Arabidopsis, a typical triple response consists of a shortening and radial swelling of the hypocotyl, an inhibition of root elongation and an exaggeration of the curvature of the apical hook (Figures 7 and 16). Etiolated morphology is dramatically altered by stress conditions which induce ethylene production; the ethylene-induced «triple response» may provide the seedling with additional strength required for penetration of compact soils, see Harpham et al., Annals of Bot., 1991, 68, 55. Ethylene may also be important for other stress responses. ACC synthase gene expression and ethylene production are induced by many types of biological and physical stress, such as wounding and pathogen infection, see Boller, in The Plant Hormone Ethylene, A.K. Mattoo and J.C. Suttle eds., 293-314, 1991, CRC Press, Inc. Boca Raton and Yu, Y. et al., Plant Phys., 1979, 63, 589, Abeles et al. 1992 Second Edition San Diego, CA Academic Press; and Gray et al. Plant Mol Biol. 1992 19, 69.
A number of researchers have identified the interaction between Arabidopsis thaliana and Pseudomonas syringae bacteria; Whalen et al., «Identification of Pseudomonas syringae Pathogens of Arabidopsis and a Bacterial Locus Determining Avirulence on Both Arabidopsis and Soybean», The Plant Cell 1991, 3, 49, Dong et al., «Induction of Arabidopsis Defense Genes by Virulent and Avirulent Pseudomonas syringae Strains and by a Cloned Avirulence Gene», The Plant Cell 1991, 3, 61, and Debener et al., «Identification and Molecular Mapping of a Single Arabidopsis thaliana Locus Determining Resistance to a Phytopathogenic Pseudomonas syringae Isolate», The Plant Journal 1991, 289. P. syringae pv. tomato (Pst) strains are pathogenic on Arabidopsis. A single bacterial gene, avrRpt2, was isolated that controls pathogen avirulence on specific Arabidopsis host genotype Col-0.
Bent, et al., «Disease Development in Ethylene-Insensitive Arabidopsis thaliana Infected with Virulent and Avirulent Pseudomonas and Xanthomonas Pathogens», Molecular Plant-Microbe Interactions 1992, 372; Agrios, Plant Pathology 1988, 126, Academic Press, San Diego; and Mussel, «Tolerance to Disease», page 40, in Plant Disease: An Advanced Treatise, Volume Horsfall, J.G. and Cowling, eds., 1980, Academic Press, New York, establish the art recognized definitions of tolerance, susceptibility, and resistance. Tolerance is defined for purposes of the present invention as growth of a pathogen in a plant where the plant does not sustain damage. Resistance is defined as the inability of a pathogen to grow in a plant and no damage to the plant results. Susceptibility is indicated by pathogen growth with plant damage.
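The three definitions above amount to a decision rule on two observations: whether the pathogen grows in the plant and whether the plant is damaged. A minimal sketch of that rule follows; the function name `classify_interaction` and its return labels are assumptions made here for illustration, and the case of damage without pathogen growth is not defined in the text.

```python
def classify_interaction(pathogen_grows: bool, plant_damaged: bool) -> str:
    """Apply the definitions of tolerance, resistance and susceptibility given in the text."""
    if pathogen_grows and plant_damaged:
        return "susceptible"   # pathogen growth with plant damage
    if pathogen_grows:
        return "tolerant"      # pathogen growth, plant sustains no damage
    if not plant_damaged:
        return "resistant"     # pathogen cannot grow, no damage results
    return "undefined"         # damage without growth is not covered by the definitions


if __name__ == "__main__":
    print(classify_interaction(True, False))
```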
Regardless of the molecular mechanisms involved, the normal ethylene response of a plant to pathogen invasion has been thought to have a cause and effect relationship in the ability of a plant to fight off plant pathogens. Plants insensitive in any fashion to ethylene were believed to be incapable of eliciting a proper defense response to pathogen invasion, and thus unable to initiate proper defense mechanisms. As such, ethylene insensitive plants were thought to be less disease tolerant.
The induction of disease responses in plants requires recognition of pathogens or pathogen-induced symptoms. In a large number of plant-pathogen interactions, successful resistance is observed when the plant has a resistance gene with functional specificity for pathogens that carry a particular avirulence gene. If the plant and pathogen carry resistance and avirulence genes with matched specificity, disease spread is curtailed and a hypersensitive response involving localized cell death and physical isolation of the pathogen typically occurs. In the absence of matched resistance and avirulence genes, colonization and tissue damage proceed past the site of initial infection and disease is observed.
A better understanding of plant pathogen tolerance is needed. Also needed is the development of methods for improving the tolerance of plants to pathogens, as well as the development of easy and efficient methods for identifying pathogen tolerant plants.
Genetic and molecular characterization of several gene loci and protein products is set forth in the present invention. The results will reveal interactions among modulatory components of the ethylene action pathway and provide insight into how plant hormones function. Thus, the quantity, quality and longevity of food, such as fruits and vegetables, and other plant products such as flowers, will be improved thereby providing more products for market in both developed and underdeveloped countries.
SUMMARY OF THE INVENTION
The present invention is directed to nucleic acid sequences for ethylene insensitive, EIN loci and corresponding amino acid sequences. Several ein wild type sequences, mutations, amino acid sequences, and protein products are included within the scope of the present invention. The nucleic acid sequences set forth in SEQUENCE ID NUMBERS 1 and 2 for ein2; 4, 5, 7, 9, and 11 for ein3 and eil1, eil2, eil3; as well as amino acid sequences set forth in SEQUENCE ID NUMBERS 3 for ein2; 6, 8, 10, 12, and 13 for ein3 and eil1, eil2, eil3; are particular embodiments of the present invention.
The present invention is also directed to nucleic acid sequences for hookless1, HLS1, alleles and amino acid sequences. Wild type and mutated nucleic acid sequences, amino acid sequences and proteins are included within the scope of the present invention. The nucleic acid sequences of hls1 are set forth in SEQUENCE ID NUMBERS: 14 and 15; the amino acid sequences are set forth in SEQUENCE ID NUMBER: 16.
These and other aspects of the invention will become more apparent from the following detailed description when taken in conjunction with the following figures.
BRIEF DESCRIPTION OF THE FIGURES
Figure 1 displays the EIN2 region on chromosome of Arabidopsis thaliana. O represents the left end probe, o represents the right end probe, a length of 100 kb is represented in the legend.
Figure 2 is a genomic Southern blot. A polymorphism was detected in ein2-12 by hybridization with g3715. The g3715 cosmid was hybridized to a genomic Southern blot containing several alleles of ein2. In ein2-12 EcoR I digested genomic DNA, two bands were missing, 1.2 kb and 4.3 kb; and a new 5.5 kb fragment was detected. The DNA from the ein2 alleles was purified according to Chang et al. Proc. Natl. Acad. Sci USA 1988 85, 6857. 5 µg of EcoR I digested DNA was separated on a 0.8% agarose gel and blotted to Hybond N (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed., 1989, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, Amersham, Arlington Heights, IL). All hybridizations were done using random hexamer labeled DNAs (Feinberg and Vogelstein, Anal. Biochem 1984 137, 266). Filters were prehybridized for at least 2 hours in 0.5 M sodium phosphate pH 7.2, 7% sodium dodecyl sulfate, and 1% BSA at 60° C. Hybridization of a minimum of 15 hours was in a solution of 0.5 M sodium phosphate pH 7.2, 7% sodium dodecyl sulfate, and 1% BSA at 60° C. Hybridization filters were washed and autoradiographed (Sambrook et al. 1989).
Figure 3 is a diagram of the polymorphism in ein2-12 due to the loss of an EcoR I site. The pgEE1.2 subclone from g3715 is shown.
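The band pattern reported for ein2-12 is what one would expect when a single restriction site between two adjacent fragments is lost: the 1.2 kb and 4.3 kb fragments disappear and one fragment of their combined length appears. The sketch below checks that arithmetic and also checks that the deleted sequence reported for ein2-12 in the detailed description (SEQ ID NO: 19) contains the EcoR I recognition sequence GAATTC. The function name and the extra 2.0 kb fragment in the usage example are hypothetical.

```python
ECORI_SITE = "GAATTC"  # EcoR I recognition sequence
EIN2_12_DELETION = "TGCTACAATCAGAATTCTTGCAGT"  # deleted nucleotides given for ein2-12 (SEQ ID NO: 19)


def fragments_after_site_loss(fragments_kb, left_kb, right_kb):
    """Merge two adjacent restriction fragments when the site between them is lost."""
    remaining = list(fragments_kb)
    remaining.remove(left_kb)
    remaining.remove(right_kb)
    remaining.append(round(left_kb + right_kb, 3))
    return sorted(remaining)


if __name__ == "__main__":
    # 2.0 kb is a hypothetical unaffected fragment, included only for illustration.
    print(fragments_after_site_loss([2.0, 1.2, 4.3], 1.2, 4.3))
    print(ECORI_SITE in EIN2_12_DELETION)
```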
Figure 4 is a description of the EIN2 locus, the cDNA (bottom) is shown relative to the genomic map (top).
A putative TATA sequence is shown approximately 60 base pairs 5′ to the start of the cDNA. The position of the translation start and stop sites are also shown.
Figure 5 exhibits the sequence of the EIN2 locus. Genomic DNA sequence (SEQUENCE ID NO: 1) is shown in lower case letters, cDNA sequence (SEQUENCE ID NO: 2) is shown in capital letters. The predicted peptide sequence (SEQUENCE ID NO: 3) is displayed under the corresponding nucleic acid codons.
Figure 6 is a schematic illustration of the ethylene biosynthesis pathway.
Figure 7 depicts a seedling body and developing plant. Specifically, Figure 7A is a cross section of the seedling body of a seed plant. Figure 7B is a perspective view of a developing seed plant.
Figure 8 identifies the protein sequences of eil1, ein3, eil2, eil3, and a common consensus protein sequence representing all four of the individual protein sequences.
Figure 9 displays the EIN3 gene structure and mutants. Also set forth in Figure 9 is the predicted polypeptide acidity and basicity, as well as Asn repeats.
Figure 10 exhibits a map of chromosome 3 and the position of EIN3 relative to other gene loci.
Figure 11 sets forth a map of chromosome 2 and the position of EIL1 relative to other gene loci.
Figure 12 displays a map of chromosome 5 and the position of EIL2 relative to other gene loci.
Figure 13 exhibits a map of chromosome 4 and the position of HLS1 relative to other gene loci.
Figure 14 is a representation of the arrangement of hls mutants on chromosome 4.
Figure 15 identifies the protein sequences of Arabidopsis HLS1 and acetyl transferases in E. coli, Pseudomonas, Streptomyces, Mouse, Human, Azospirillum, Yeast, and Citrobacter. A consensus sequence representing common amino acids of the sequences is also provided.
Figure 16 displays ethylene responses in wild type and mutant (ctr1, eto1, hls1, etr1, ein2, ein3) Arabidopsis seedlings. Seeds of the indicated genotype were germinated and grown for three days in the dark in either air or air containing 10 ppm ethylene.
Figure 17 is a genetic model of interactions among components of the ethylene signal transduction pathway. This model shows the predicted order in which the various gene products act which is based on the epistatic relationships among the mutants. The seedling ethylene responses are indicated on the right.
Figure 18 is a representation of pNLEIN3Bgl2 indicating the relationship between the promoter, GUS, and EIN3 sequences.
Figure 19 displays EIN3 sequences. Figure 19A sets forth EIN3 cDNA (SEQUENCE ID NO: 4), Figure 19B sets forth EIN3 genomic DNA (SEQUENCE ID NO: 5), and Figure 19C sets forth EIN3 protein sequence (SEQUENCE ID NO: 6).
Figure 20 displays EIL1 sequences. Figure 20A sets forth EIL1 cDNA (SEQUENCE ID NO: 7), Figure 20B sets forth EIL1 peptide sequence (SEQUENCE ID NO: 8).
Figure 21 displays EIL2 sequences. Figure 21A sets forth EIL2 cDNA (SEQUENCE ID NO: 9), Figure 21B sets forth EIL2 peptide sequence (SEQUENCE ID NO: 10).
Figure 22 displays EIL3 sequences. Figure 22A sets forth EIL3 cDNA (SEQUENCE ID NO: 11). EIL3 peptide sequence is set forth in SEQUENCE ID NO: 12.
Figure 23 displays HLS1 sequences. Figure 23A sets forth HLS1 cDNA (SEQUENCE ID NO: 14), Figure 23B sets forth HLS1 genomic DNA sequence (SEQUENCE ID NO: 15), and Figure 23C sets forth HLS1 peptide sequence.
DETAILED DESCRIPTION OF THE INVENTION
The present invention is directed to nucleic acid and amino acid sequences which lend valuable characteristics to plants.
The present invention is directed to nucleic acid sequences of the EIN2 locus. Wild type and mutant sequences of EIN2 are within the scope of the present invention. Amino acid and protein sequences corresponding to the nucleic acid sequences are included in the present invention. EIN2 mutations provide for ethylene insensitivity and pathogen tolerance in plants.
SEQUENCE ID NO: 2, the isolated cDNA representing the nucleic acid sequence coding for EIN2 and the isolated genomic EIN2 sequence of SEQUENCE ID NO: 1 are embodiments of the present invention. The purified amino acid sequence of SEQUENCE ID NO: 3 represents the EIN2 protein product encoded by the cDNA identified above. The EIN2 mutations identified herein by nucleotide position are measured in accordance with the beginning of the cDNA.
An ein2-3 mutation was created by X-ray mutagenesis which resulted in a thymidine insertion at nucleotide position 3642 of the cDNA sequence in SEQUENCE ID NO: 2. A frameshift results in the corresponding amino acid sequence.
An ein2-4 mutation was also generated by X-ray mutagenesis. The ein2-4 mutation has an «AG» to «TTT» mutation at position 2103 of the EIN2 cDNA sequence resulting in a frameshift in the corresponding amino acid sequence.
An ein2-5 mutation was generated by X-ray mutagenesis, such that a deletion beginning at nucleic acid position 1570 of the cDNA occurred. Nucleic acids CATGACT were deleted. A frameshift results in the corresponding protein product.
An ein2-6 mutation has a deletion of nucleic acids GAGTTGCGCATG, SEQ ID NO: 17, beginning at nucleic acid position 965 of the cDNA sequence. The ein2-6 mutation was generated by Agrobacterium mutagenesis. This mutation results in a deletion at the amino acid level of Gly-Val-Ala-His, SEQ ID NO: 18, formerly beginning at amino acid position 115.
Another mutation, ein2-9 was generated by DEB mutagenesis and has an to transition at position 4048 that results in a «His» to «Pro» change at amino acid position 1143 in the corresponding protein.
ein2-11 was generated by DEB mutagenesis and has a «TG» to «AT» transition at nucleic acid position 3492.
This results in an Ochre stop signal at amino acid position 957 in the protein.
An ein2-12 mutation was obtained by X-ray mutagenesis resulting in a deletion at nucleic acid position 1611 of nucleic acids TGCTACAATCAGAATTCTTGCAGT, SEQ ID NO: 19. The corresponding amino acid sequence reveals a deletion of amino acids Ala-Thr-Ile-Arg-Ile-Leu- Ala-Val, SEQ ID NO: 20, beginning at amino acid position 331.
An ein2-16 mutation results in an «AGT» to «G» transition at nucleic acid position 2851 as a result of X-ray mutagenesis. A frameshift results in the corresponding protein.
Table 4 sets forth the EIN2 alleles and the results of the mutagenesis.
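Whether an insertion or deletion shifts the reading frame depends only on whether the net change in length is a multiple of three, and the outcomes reported for the ein2 alleles above follow that rule. The sketch below applies it to the sequence changes given in the text and translates the two in-frame deletions using the standard genetic code. The function names are assumptions made here, and the frame offsets used for translation are inferred so that the output matches the residues listed in the text; they are not taken from the sequence listing.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def indel_effect(deleted: str, inserted: str = "") -> str:
    """Return 'in-frame' if the net length change is a multiple of 3, else 'frameshift'."""
    return "in-frame" if (len(inserted) - len(deleted)) % 3 == 0 else "frameshift"


def translate(seq: str, offset: int = 0) -> str:
    """Translate complete codons of seq, starting at the given offset (one-letter codes)."""
    codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


if __name__ == "__main__":
    # Sequence changes as given in the text for each allele.
    print(indel_effect("", "T"))        # ein2-3: thymidine insertion
    print(indel_effect("AG", "TTT"))    # ein2-4
    print(indel_effect("CATGACT"))      # ein2-5
    print(indel_effect("GAGTTGCGCATG"))  # ein2-6
    print(indel_effect("AGT", "G"))     # ein2-16
    # Offsets 2 and 1 are inferred frames (assumption), chosen to match the listed residues.
    print(translate("GAGTTGCGCATG", offset=2))
    print(translate("TGCTACAATCAGAATTCTTGCAGT", offset=1))
```

In the inferred frames, the ein2-6 deletion covers the complete codons for Val-Ala-His and the ein2-12 deletion covers those for Ala-Thr-Ile-Arg-Ile-Leu-Ala; the first and last residues named in the text (Gly and Val, respectively) come from codons that straddle the deletion boundaries.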
Ein3 sequences for genes and proteins are the subject of the present invention. The present invention is directed to wild type nucleic acid and amino acid sequences as well as mutations of these sequences. EIN3 mutations result in ethylene insensitive plants. Ein-like genes and protein sequences, including eil1, eil2, and eil3 sequences, are similar to ein3 sequences, and are also disclosed in the present invention. The EIN3 mutations are identified below by nucleotide position number in accordance with the beginning of the genomic DNA sequence.
The DNA sequences coding for ein3 are set forth in SEQ ID NOS: 5 (genomic) and 4 (cDNA). The amino acid sequence may be found in SEQ ID NO: 6.
In ein3-1, a to conversion in the genomic DNA at nucleotide 1598 occurs as a result of EMS mutagenesis. In the corresponding protein, is changed to a stop codon at amino acid position 215. The ein3-2 mutation was generated by T-DNA insertion mutagenesis. The T-DNA inserted after nucleotide 2001 of the genomic, interrupting the protein after amino acid 349. The ein3-3 mutation results in a to switch at nucleotide position 1688 of genomic DNA as a result of DEB mutagenesis. The amino acid sequence results in a conversion of to at amino acid position 245.
The cDNAs of eil1, eil2, and eil3, are set forth in SEQ ID NOS: 7, 9, and 11, respectively. The corresponding amino acid sequences for the ein-like genes are set forth in SEQ ID NOS: 8, 10, and 12, (eil1, eil2, and eil3, respectively). A consensus sequence representing the common codons of the three ein-like genes is SEQ ID NO: 13.
Table 6 sets forth the EIN3 alleles and the results of the mutagenesis. The translation start site of EIN3 is at nucleotide position 954 of the genomic sequence. The translation start sites for EIL1, EIL2, and EIL3 are at nucleotide positions 251, 8, and 102 of the respective cDNA sequences.
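Given a translation start site in a cDNA, a nucleotide position can be converted to a codon (amino acid) number with simple integer arithmetic. The sketch below uses the EIL1, EIL2, and EIL3 start positions stated above; it assumes those positions are 1-based and refer to the first base of the start codon, and the function name `codon_number` is introduced here for illustration. It does not apply to genomic coordinates such as the EIN3 start at position 954, where introns would have to be accounted for.

```python
# Translation start positions in the respective cDNA sequences, as stated in the text.
# Assumption: positions are 1-based and point at the first base of the start codon.
TRANSLATION_START = {"EIL1": 251, "EIL2": 8, "EIL3": 102}


def codon_number(gene: str, cdna_position: int) -> int:
    """Return the 1-based codon number containing the given cDNA position."""
    start = TRANSLATION_START[gene]
    if cdna_position < start:
        raise ValueError("position lies upstream of the translation start")
    return (cdna_position - start) // 3 + 1


if __name__ == "__main__":
    print(codon_number("EIL1", 251))
    print(codon_number("EIL3", 105))
```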
The present invention is directed to wild type and mutant sequences for the HLS1 locus. The hls gene is regulated by ethylene directly. Amino acid and protein sequences corresponding to the wild type and mutant gene for HLS1 are within the scope of the present invention.
The present invention is directed to nucleic acid sequences of the HLS1 locus. Wild type and mutant sequences of HLS1 are within the scope of the present invention. Amino acid and protein sequences corresponding to the nucleic acid sequences are included in the present invention. The HLS1 mutations are identified below by nucleotide position number in accordance with the beginning of the genomic DNA sequence.
SEQUENCE ID NO: 14, the isolated cDNA representing the nucleic acid sequence coding for HLS1, and the isolated genomic HLS1 sequence of SEQUENCE ID NO: 15 are embodiments of the present invention. The purified amino acid sequence of SEQUENCE ID NO: 16 represents the HLS1 protein product encoded by the cDNA identified above.
An hls1-1 mutation was created by EMS mutagenesis which resulted in a to transition at nucleotide position 3487 of the genomic DNA sequence. This frameshift results in the corresponding amino acid sequence having a «Glu» to «Lys» substitution at amino acid position 345.
An hls1-5 mutation was generated by DEB mutagenesis. The hls1-5 mutation has an to «A» mutation at position 2194 of the HLS1 genomic DNA sequence, resulting in a mutation in the splice donor site. An hls1-7 mutation was also created by DEB and resulted in a to transition at nucleic acid position 2194. The result in the amino acid sequence is also a mutation in the splice donor site. Mutations at splice donor sites often result in aberrant splicing causing a frameshift or insertion to occur. The exact nature of the change in hls1-5 and hls1-7 may be determined by analyzing the protein from those mutants using an antibody.
hls1-6 is a mutation created by EMS resulting in a to transition at nucleic acid position 3431. The corresponding amino acid sequence has a «Lys» to «Trp» substitution at amino acid position 326.
The mutation hls1-4 was created by DEB mutagenesis resulting in a to transition at nucleic acid position 3487. The corresponding amino acid sequence has a «Glu» to «Lys» change at amino acid position 345.
hls1-9 is created by EMS mutagenesis. The sequence results in to at nucleic acid position 2060, which corresponds to an «Arg» to «TGA» creating a «stop signal» at amino acid position 11.
hls1-8 is a mutation resulting from EMS mutagenesis. The nucleic acid sequence has a to «T» change at position 2992. The mutation results in an amino acid sequence having an «Arg» to «Stop» transition at amino acid position 180.
An EMS mutation resulting in a to change at nucleic acid position 2033 is represented by hls1-10.
The amino acid sequence corresponding to the mutation reveals a «Met» (Start signal) to «Ile» transition at amino acid position 1.
Table 7 sets forth the HLS1 alleles and the results of the mutagenesis.
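Several of the hls1 alleles above are single-nucleotide substitutions whose effect depends on the codon they fall in: a change to a stop codon truncates the protein, while other changes replace one amino acid with another. The following sketch classifies such substitutions using the standard genetic code. The specific codons used in the usage example (CGA, GAA, ATG to ATA) are hypothetical choices that reproduce the kinds of amino acid changes described; the text gives only TGA for the hls1-9 stop codon and does not give the other codons.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_substitution(codon: str, index: int, new_base: str):
    """Apply a single-base change to a codon and classify its effect.

    Returns (mutant_codon, original_residue, new_residue, kind), where kind is
    'silent', 'nonsense' (new stop codon) or 'missense'.
    """
    mutant = codon[:index] + new_base + codon[index + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutant]
    if after == before:
        kind = "silent"
    elif after == "*":
        kind = "nonsense"
    else:
        kind = "missense"
    return mutant, before, after, kind


if __name__ == "__main__":
    # Hypothetical codons chosen to illustrate the changes described in the text.
    print(classify_substitution("CGA", 0, "T"))  # Arg to stop (TGA)
    print(classify_substitution("GAA", 0, "A"))  # Glu to Lys
    print(classify_substitution("ATG", 2, "A"))  # Met (start) to Ile
```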
In accordance with the present invention, nucleic acid sequences include and are not limited to DNA, including and not limited to cDNA and genomic DNA; RNA, including and not limited to mRNA and tRNA; and suitable nucleic acid sequences such as those set forth in SEQUENCE ID NUMBERS set forth herein, and alterations in the nucleic acid sequences including alterations, deletions, mutations and homologs. In addition, mismatches within the sequences identified above, which achieve the methods of the invention, are also considered within the scope of the disclosure. The sequences may also be unmodified or modified.
Also amino acid, peptide and protein sequences within the scope of the present invention include, and are not limited to, the sequences set forth herein and alterations in the amino acid sequences including alterations, deletions, mutations and homologs.
In accordance with the invention, the nucleic acid sequences employed in the invention may be exogenous/heterologous sequences. Exogenous and heterologous, as used herein, denote a nucleic acid sequence which is not obtained from and would not normally form a part of the genetic make-up of the plant or the cell to be transformed, in its untransformed state. Plants comprising exogenous nucleic acid sequences of ein2, ein3, eil1, eil2, eil3, or hls1 mutations, such as and not limited to the nucleic acid sequences of SEQUENCE ID NUMBERS set forth herein, are within the scope of the invention.
Transfected and/or transformed plant cells comprising nucleic acid sequences of ein2, ein3, eil1, eil2, eil3, or hls1 mutations, such as and not limited to the nucleic acid sequences of SEQUENCE ID NUMBERS set forth herein, are within the scope of the invention. Transfected cells of the invention may be prepared by employing standard transfection techniques and procedures as set forth in Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed., 1989, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, hereby incorporated by reference in its entirety.
In accordance with the present invention, mutant plants which may be created with the sequences of the claimed invention include higher and lower plants in the Plant Kingdom. Mature plants and seedlings are included in the scope of the invention. A mature plant includes a plant at any stage in development beyond the seedling. A seedling is a very young, immature plant in the early stages of development.
Particularly preferred plants are those from: the Family Umbelliferae, particularly of the genera Daucus (particularly the species carota, carrot) and Apium (particularly the species graveolens dulce, celery) and the like; the Family Solanaceae, particularly of the genus Lycopersicon, particularly the species esculentum (tomato) and the genus Solanum, particularly the species tuberosum (potato) and melongena (eggplant), and the like, and the genus Capsicum, particularly the species annuum (pepper) and the like; and the Family Leguminosae, particularly the genus Glycine, particularly the species max (soybean) and the like; and the Family Cruciferae, particularly of the genus Brassica, particularly the species campestris (turnip), oleracea cv Tastie (cabbage), oleracea cv Snowball Y (cauliflower) and oleracea cv Emperor (broccoli) and the like; the Family Compositae, particularly the genus Lactuca, and the species sativa (lettuce), and the genus Arabidopsis, particularly the species thaliana (Thale cress) and the like. Of these Families, the most preferred are the leafy vegetables, for example, the Family Cruciferae, especially the genus Arabidopsis, most especially the species thaliana.
Ein2 mutant sequences render plants disease and pathogen tolerant, and ethylene insensitive. For purposes of the current invention, disease tolerance is the ability of a plant to survive infection with minimal injury or reduction in the harvested yield of saleable material.
Plants with disease tolerance may have extensive levels of infection but have little necrosis and few to no lesions.
These plants may also have reduced necrotic and water soaking responses and chlorophyll loss may be virtually absent. In contrast, resistant plants generally limit the growth of pathogens and contain the infection to a localized area with multiple apparent injurious lesions.
The current invention is directed to, for example, identifying plant tolerance to bacterial infections including, but not limited to, Clavibacter michiganense (formerly Corynebacterium michiganense), Pseudomonas solanacearum and Erwinia stewartii, and more particularly, Xanthomonas campestris (specifically pathovars campestris and vesicatoria) and Pseudomonas syringae (specifically pathovars tomato and maculicola).
In addition to bacterial infections, disease tolerance to infection by other plant pathogens is within the scope of the invention. Examples of viral and fungal pathogens include, but are not limited to tobacco mosaic virus, cauliflower mosaic virus, turnip crinkle virus, turnip yellow mosaic virus; fungi including Phytophthora infestans, Peronospora parasitica, Rhizoctonia solani, Botrytis cinerea, Phoma lingam (Leptosphaeria maculans), and Albugo candida.
Like ein2, ein3 mutants also exhibit ethylene insensitivity. However, ein3 mutants do not exhibit disease or pathogen tolerance. Ethylene, CH2=CH2, is a naturally occurring plant hormone. The ethylene regulatory pathway includes the ethylene biosynthesis pathway and the ethylene autoregulatory or feedback pathway, see Figure 6.
In the ethylene biosynthesis pathway, methionine is converted to ethylene with S-adenosylmethionine (SAM) and 1-aminocyclopropane-1-carboxylic acid (ACC) as intermediates. These two reactions are catalyzed by ACC synthase and ethylene-forming enzyme (EFE), respectively.
Little is known about the enzymes catalyzing these reactions and their regulation at the molecular level.
The receptor and receptor complex of Figure 6 are believed to function with the autoregulatory pathway in the control of ethylene production. Ethylene regulatory pathway inhibitors are positioned along the left side of Figure 6. The inhibitors include AVG (aminoethoxyvinylglycine) and AIB (α-aminoisobutyric acid). The steps at which the mutants, ethylene overproducer (eto1), ethylene insensitive (ein1, ein2) and hookless (hls1), are defective appear on the right of Figure 6.
In accordance with the claimed invention, ethylene insensitive plants are those which are unable to display a typical ethylene response when treated with high concentrations of ethylene. For purposes of the present invention, ethylene insensitivity includes total or partial inability to display a typical ethylene response. A typical ethylene response in wild type plants includes, for example, the so-called «triple response» which involves inhibition of root and stem elongation, radial swelling of the stem, and absence of normal geotropic response (diageotropism). Thus, for example, ethylene insensitive plants may be created in accordance with the present invention by the presence of an altered «triple response» wherein the root and stem are elongated despite the presence of high concentrations of ethylene. Further, a typical ethylene response also includes a shut down or diminution of endogenous ethylene production, upon application of high concentrations of ethylene. Ethylene insensitive plants may thus also be screened for, in accordance with the present invention, by the ability to continue production of ethylene, despite administration of high concentrations of ethylene. Such ethylene insensitive plants are believed to have impaired receptor function such that ethylene is constitutively produced despite the presence of an abundance of exogenous ethylene.
Screening includes screening for root or stem elongation and screening for increased ethylene production.
Ethylene sensitive wild type plants experience an inhibition of root and stem elongation when an inhibitory amount of ethylene is administered. By inhibition of root and stem elongation, it is meant that the roots and stems grow less than the normal state (that is, growth without application of an inhibitory amount of ethylene).
Typically, for normal Arabidopsis (Col) grown without ethylene or the ethylene precursor aminocyclopropane, ACC, root elongation is about 6.5 ± 0.2 mm/3 days; normal stem elongation is 8.7 ± 0.3 mm/3 days. ein2-1 plants grown without ethylene or ACC have root elongation of about 7.5 ± 0.2 mm/3 days and stem elongation of 11.35 ± 0.3 mm/3 days.
In the presence of 100 µM ACC, Col root growth is 1.5 ± 0.04 mm/3 days and ein2-1 root growth is 4.11 ± 0.1 mm/3 days; stem growth is 3.2 ± 0.1 mm/3 days for Col and 8.0 ± 0.2 mm/3 days for ein2-1. Alternatively, plants may be sprayed with ethephon or ethrel. By roots, as used here, it is meant mature roots (that is, roots of any plant beyond the rudimentary root of the seedling), as well as roots and root radicles of seedlings. Stems include hypocotyls of immature plants of seedlings and stems, and plant axes of mature plants (that is, any stem beyond the hypocotyl of seedlings). See Figure 7A and Figure 7B.
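The growth figures quoted above can be restated as percent inhibition, which makes the Col and ein2-1 responses easier to compare. The following Python sketch is illustrative only and is not part of the original disclosure; the dictionary layout and the function name percent_inhibition are assumptions introduced here, and only the mean values (not the standard errors) are used.

```python
# Mean elongation over 3 days (mm), as quoted in the text:
# (grown without ACC, grown with 100 µM ACC)
growth_mm = {
    "Col":    {"root": (6.5, 1.5),  "stem": (8.7, 3.2)},
    "ein2-1": {"root": (7.5, 4.11), "stem": (11.35, 8.0)},
}


def percent_inhibition(untreated, treated):
    """Percentage reduction in elongation caused by the treatment."""
    return 100.0 * (untreated - treated) / untreated


# Hypothetical summary structure: line -> organ -> percent inhibition
inhibition = {
    line: {organ: percent_inhibition(*pair) for organ, pair in organs.items()}
    for line, organs in growth_mm.items()
}

for line, organs in inhibition.items():
    for organ, pct in organs.items():
        print(f"{line} {organ}: {pct:.1f}% inhibition")
```

On these means, ACC reduces Col root and stem elongation by a larger percentage than it reduces ein2-1 elongation, which is the partial insensitivity the passage describes.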
Ethylene sensitive wild type plants experience a shut down or diminution of endogenous ethylene production, upon application of high concentrations of ethylene. In the ethylene insensitive plants of the present invention, the plants continue endogenous production of ethylene, despite administration of inhibitory amounts of ethylene.
Ethylene production for wild type and ethylene insensitive mutants is shown in Table 1. An ethylene insensitive plant will produce an amount or have a rate of ethylene production greater than that of a wild type plant upon administration of an inhibitory amount of ethylene. As one skilled in the art will recognize, absolute levels of ethylene produced will change with growth conditions.
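Because absolute ethylene levels vary with growth conditions, a ratio to the wild type measured in the same experiment is often easier to interpret than the raw value. The sketch below is an illustration under that assumption and is not taken from the disclosure: it uses the etiolated-seedling means listed in Table 1, ignores the reported errors, and the function name fold_over_wild_type is hypothetical.

```python
# Etiolated-seedling ethylene accumulation (nL), mean values from Table 1.
etiolated_nl = {
    "Wild Type": 6.7,
    "eto1-1": 276.72,
    "hls1-1": 5.81,
    "ein1-1": 12.73,
    "ein2-1": 20.69,
}


def fold_over_wild_type(values, reference="Wild Type"):
    """Express each strain's accumulation as a multiple of the reference."""
    ref = values[reference]
    return {strain: v / ref for strain, v in values.items() if strain != reference}


fold = fold_over_wild_type(etiolated_nl)
for strain, ratio in sorted(fold.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{strain}: {ratio:.2f}x wild type")
```

A ratio above 1 for ein1-1 and ein2-1 is consistent with the statement that ethylene insensitive plants produce more ethylene than the wild type.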
The ein1 and ein2 mutants are described, for example, in Guzman et al., «Exploiting the Triple Response of Arabidopsis to Identify Ethylene-Related Mutants», The Plant Cell 1990, 2, 513, the disclosures of which are hereby incorporated herein by reference, in their entirety.
The present invention is further described in the following examples. These examples are not to be construed as limiting the scope of the appended claims.
EXAMPLE 1
PRODUCTION OF Arabidopsis MUTANTS
The production of plants which exhibit enhanced disease tolerance and ethylene insensitivity was investigated with the use of Arabidopsis ein mutants, which are insensitive to ethylene and are derived from Arabidopsis Col-0. The ein mutants were prepared according to the method of Guzman et al., The Plant Cell, 1990, 2, 513, the disclosures of which are hereby incorporated herein by reference, in their entirety. Specifically, twenty five independent ethylene-insensitive mutants were isolated; six mutants which showed at least three-fold difference in the length of the hypocotyl compared with ethylene-treated wild-type hypocotyl were further characterized. In these mutants, the apical hook was either present, absent or showed some curvature in the apical region. The appearance of the apical curvature was dependent on the duration of the incubation. After more than 3 days of incubation in the dark with 10 µl/L ethylene, the apical curvature was absent. This phenotype was named «ein» for ethylene insensitive.
Mendelian analysis indicated that insensitivity to ethylene was inherited as either a dominant or recessive trait depending on the mutation studied. Complementation analysis was performed with five recessive mutants to determine whether more than one locus was involved in this phenotype. The results of these studies indicated that all five recessive mutations were allelic. The ein phenotype was tested for linkage to nine visible markers to determine whether the recessive and dominant ein mutations were allelic. The dominant ein mutation was mapped close to the ap-1 locus on chromosome 1 and was named ein1-1.
None of the nine markers showed linkage to the recessive ein mutation. Restriction fragment length polymorphism (RFLP) analysis was performed to map this mutation.
Randomly selected RFLP probes were initially used to assess linkage. After testing probes from three different chromosomes, linkage was detected to one RFLP from chromosome 4, and the mutation was named ein2-1. This observation was confirmed using additional RFLP probes from the same chromosome. Further experimentation confirmed ein2-2, ein2-3, ein2-4 and ein2-5 to be alleles of ein2-1.
Growth features of ethylene insensitive mutants were also observed. After seedlings were planted in soil and cold treated at 4°C for 4 days, the seedlings were incubated in the dark at 23°C for 66-72 hours. Plants were grown to maturity in a growth chamber at 22°C to 25°C under continuous illumination with fluorescent and incandescent light. The rosette of ein1-1 and ein2-1 plants was larger compared with the wild type, Col-0, rosette and a delay in bolting (1 cm to 2 cm growth in the length of the stem) was observed. These observations indicated that the ethylene insensitive mutations identified at the seedling stage exerted remarkable effects during adult stages of growth.
eto mutants, which constitutively produce ethylene, were initially screened by observing a constitutive triple response: seedlings with inhibition of hypocotyl and root elongation, swelling of the hypocotyl and exaggerated tightening of the apical hook. Mendelian segregation analysis determined the genetic basis of these mutations to be a single recessive mutation, which was identified as ethylene overproducer, or eto.
The eto1, ein1 and ein2 mutants were analyzed to determine ethylene accumulation. The mutants were backcrossed to the wild type before physiological examination. Surface-sterilized seeds (about 500) were germinated and grown for 66 to 72 hours in the dark at 23°C in 20 ml gas chromatograph vials containing 15 ml of growth medium.
To measure the conversion of exogenous 1-aminocyclopropane-1-carboxylic acid (ACC, an intermediate in ethylene production) to ethylene, seedlings were grown in 1% low-melting-point agarose buffered with 3 mM Mes at pH 5.8. In this solid support no chemical formation of ethylene from ACC was detected at any of the concentrations of ACC employed.
Ethylene accumulation from tissues of mature plants (100 mg) was measured after overnight incubation in 20 ml gas chromatograph vials. Leaves and inflorescence were taken from 24-28 day old plants, siliques from 32-36 day old plants. Accumulation of ethylene was determined by gas chromatography using a photo-ionization detector (HNU) and a Hewlett Packard HP5890A gas chromatograph equipped with an automated headspace sampler. A certified standard of 10 µl/L ethylene (Airco) was used to calculate ethylene concentrations. The concentration of the inhibitors of ethylene biosynthesis and ethylene action was determined empirically. For eto mutants, AVG, α-aminoisobutyric acid, and AgNO3 supplemented the media at 5 µM, 2 mM and 0.1 mM, respectively, and trans-cyclooctene (17 µl/L) was injected into the vial after the cold treatment. Ethylene production was increased significantly in the dominant ein1-1 mutant and the recessive ein2-1 mutant, see Table 1.
Ethylene production was inhibited in eto1-1 seedlings that were grown in media supplemented with the ethylene inhibitors aminoethoxyvinylglycine, AVG, and α-aminoisobutyric acid, AIB, see Table 1.
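The text states that a certified 10 µl/L ethylene standard was used to calculate concentrations but does not give the calculation. One common approach, shown here purely as an assumption, is single-point calibration against the standard's peak area followed by conversion to a headspace volume. The peak areas, the 5 ml headspace (a 20 ml vial less 15 ml of medium, ignoring tissue volume) and both function names are hypothetical.

```python
STANDARD_UL_PER_L = 10.0  # certified ethylene standard named in the text


def ethylene_concentration(sample_area, standard_area,
                           standard_conc=STANDARD_UL_PER_L):
    """Single-point calibration; assumes a linear detector response through zero."""
    if standard_area <= 0:
        raise ValueError("standard peak area must be positive")
    return standard_conc * sample_area / standard_area


def ethylene_volume_nl(conc_ul_per_l, headspace_ml):
    """Convert a headspace concentration (µl/L) to nL of ethylene in the vial."""
    headspace_l = headspace_ml / 1000.0
    return conc_ul_per_l * headspace_l * 1000.0  # µl -> nL


# Hypothetical peak areas, for illustration only.
conc = ethylene_concentration(sample_area=500.0, standard_area=1000.0)
print(f"{conc:.2f} µl/L, {ethylene_volume_nl(conc, headspace_ml=5.0):.1f} nL in vial")
```

A multi-point calibration curve would be preferable if the detector response is not linear over the range measured.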
The EIL sequences represent cDNA sequences similar to the EIN3 sequence. They were obtained by screening an Arabidopsis seedling cDNA library (Kieber et al., Cell, 1993, 72, 427-441) at low stringency in the following manner. The cDNA library was hybridized with the radiolabeled EIN3 cDNA insert at 42°C for 48 hours in a hybridization solution consisting of 30% formamide, Denhardt’s solution, 0.5% SDS, 5X SSPE, 0.1 mg/ml sheared salmon sperm DNA, according to the methods of Feinberg and Vogelstein, Anal. Biochem. 1984, 137, 266-267, incorporated herein by reference in its entirety. The filters were washed at 42°C with 30% formamide, 0.5% SDS, SSPE; followed by 2X SSPE.
Mutagenized HLS1 plants were obtained as set forth above for EIN2, EIN3, and EIL.
Table 1
Ethylene Production in Triple Response Mutants

Strain      Tissue                   Ethylene Accumulation
Wild Type   Etiolated Seedlings      6.7 ± 0.68 nL
            Light-grown Seedlings    84.25 ± 13.95 nL
            Leaves                   73.01 ± 17.64 nL/g
            Siliques                 144.96 ± 28.99 nL/g
            Inflorescence            234.53 ± 18.04 nL/g
eto1-1      Etiolated Seedlings      276.72 ± 53.70 nL
            Light-grown Seedlings    182.01 ± 24.84 nL
            Leaves                   174.39 ± 29.18 nL/g
            Siliques                 322.16 ± 38.66 nL/g
            Inflorescence            1061.84 ± 72.16 nL/g
hls1-1      Etiolated Seedlings      5.81 ± 0.32 nL
            Leaves                   31.56 ± 0.32 nL
ein1-1      Etiolated Seedlings      12.73 ± 2.79 nL
            Leaves                   222.95 ± 2.79 nL
ein2-1      Etiolated Seedlings      20.69 ± 2.09 nL
            Leaves                   135.59 ± 26.89 nL/g

Another ethylene insensitive mutant of Arabidopsis thaliana was designated etr by Bleecker et al. in «Insensitivity to Ethylene Conferred by a Dominant Mutation in Arabidopsis thaliana», Science 1990, 241, 1086, the disclosures of which are hereby incorporated herein by reference, in their entirety. Etr was identified by the ethylene-mediated inhibition of hypocotyl elongation in dark-grown seedlings. Populations of the M2 generation from mutagenized seed of Arabidopsis thaliana were plated on a minimal medium solidified with 1% agar and placed in a chamber through which 5 µl/L ethylene in air was circulated. Seedlings that had grown more than 1 cm after 4 days were selected as potential ethylene insensitive mutants. A screen of 75,000 seedlings yielded three mutant lines that showed heritable insensitivity to ethylene.
Hypocotyl elongation of the etr mutant line was unaffected by ethylene at concentrations of up to 100 µl/L, while elongation of the wild type was inhibited by 70% with ethylene at 1 µl/L.
EXAMPLE 2
CLONING AND SEQUENCING OF EIN2
The EIN2 locus was identified by a map-based cloning strategy described as follows. The ein2-1 mutant was crossed onto the DP28 marker line (dis1, clv2, er, according to the methods of Koornneef and Stamm, Methods in Arabidopsis Research, eds. C. Koncz, N-H Chua, and J. Schell, 1992, World Scientific Publishing Co., Singapore, incorporated herein by reference in its entirety. The F2 progeny were mapped with Restriction Fragment Length Polymorphisms (RFLPs) according to the methods of Chang et al., Proc. Natl Acad. Sci. USA 1988, 85, 6856 and Nam et al., Plant Cell 1990, 1, 699, the disclosures of which are hereby incorporated by reference in their entirety.
The ein2-1 mutation was found to segregate with RFLPs on the top of chromosome five (Table ). Two recombinant progeny found with λ217 (E15 and E54) were also recombinant with the more proximal g3837 and λ291 clones, indicating that ein2-1 is distal to λ217. Recombinant plants were identified by examining F3 families from the ein2-1 x DP28 cross for the genotype at the λ217 locus.
Protocols are the same as for mapping with RFLPs. Recombinants were defined by having at least one recombinant chromosome in an ein2-1 homozygote. The Ubq6121 marker, however, identified a different F2 progeny (E46) as being recombinant. This positions ein2 within the interval of λ217 and Ubq6121. To further limit the position of ein2 on the top of chromosome 5, recombinants were sought with the PCR based marker ATHCTR1, Bell et al., Methods in Plant Molecular Biology: A Laboratory Manual, 1993, eds. Maliga, Klessig, and Cashmore, Cold Spring Harbor Laboratory Press, the disclosure of which is hereby incorporated by reference in its entirety.
A single recombinant progeny was identified in 102 F2 progeny scored. This F2 progeny was also recombinant at the proximal λ217 and ASA1 markers, demonstrating the position of ein2 as distal to ATHCTR1.
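A rough sense of how tightly ATHCTR1 is linked to ein2 can be obtained from the count given above. The sketch below is an illustrative estimate, not a calculation from the disclosure. It assumes that all 102 F2 plants scored were ein2-1 homozygotes, that each contributes two informative chromosomes, and that the single recombinant plant carries exactly one recombinant chromosome; the function name is hypothetical.

```python
def recombination_fraction(recombinant_chromosomes, plants_scored,
                           chromosomes_per_plant=2):
    """Recombinant chromosomes divided by total chromosomes scored."""
    total = plants_scored * chromosomes_per_plant
    if total <= 0:
        raise ValueError("at least one chromosome must be scored")
    return recombinant_chromosomes / total


# One recombinant among 102 F2 progeny, as stated in the text.
rf = recombination_fraction(recombinant_chromosomes=1, plants_scored=102)
print(f"Estimated recombination fraction: {rf:.4f} ({100 * rf:.2f}%)")
```

With so few recombinants the estimate carries a wide uncertainty, which is one reason additional crosses and markers are useful.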
Additional genetic information was generated by examining recombinant progeny from a cross between ein2-1 and Two additional recombination events between ein2-1 and ATHCTR1 were identified by this approach. There were no recombinant plants identified at the g3715 locus, a cosmid clone identified in Nam et al., supra.
Table 2
Characterization of Plants Having ein2 Mutation

ALLELE          HYPOCOTYL ± SE   ROOT ± SE    TL ± SE
Columbia        3.6 ± 0.2        1.6 ± 0.1    5.2 ± 0.2
Landsberg       3.2 ± 0.1        1.7 ± 0.1    4.9 ± 0.2
Wassilewskija   2.7 ± 0.1        0.9 ± 0.1    3.6 ± 0.1
ein2-1          6.0 ± 0.3        7.1 ± 0.1    13.1 ± 0.4
ein2-3          8.2 ± 0.7        5.9 ± 0.3    14.1 ± 0.4
ein2-4          7.5 ± 0.2        6.3 ± 0.4    13.8 ± [illegible]
[illegible]     8.4 ± 0.2        7.2 ± 0.5    15.6 ± [illegible]
ein2-6          8.8 ± 0.4        5.4 ± 0.2    14.2 ± [illegible]
ein2-7          5.9 ± 0.1        3.8 ± 0.1    9.7 ± 0.2
ein2-9          7.3 ± 0.2        5.5 ± 0.2    12.8 ± 0.3
ein2-10         6.4 ± 0.1        4.7 ± 0.4    11.1 ± [illegible]
ein2-11         8.1 ± 0.1        7.7 ± 0.3    15.8 ± 0.4
ein2-12         6.5 ± 0.3        4.4 ± 0.3    10.9 ± 0.4
ein2-13         5.4 ± 0.2        3.7 ± 0.2    9.1 ± 0.4
ein2-15         6.9 ± 0.5        5.3 ± 0.4    12.2 ± 0.9
ein2-16         8.1 ± 0.3        7.7 ± 0.6    15.8 ± 0.7
ein2-18         6.2 ± 0.2        6.5 ± 0.4    12.7 ± 0.4
ein2-19         7.1 ± 0.2        6.2 ± 0.5    13.3 ± 0.6
ein2-20         5.8 ± 0.2        5.2 ± 0.2    11.0 ± [illegible]

All units are in mm. TL = Total Length; SE = Standard Error.
Guzman and Ecker, Plant Cell 1990, 2, 513.
Gift of Caren Chang and Elliot Meyerowitz, Pasadena, CA.
The flanking genetic markers were used to build a Yeast Artificial Chromosome (YAC) physical contig spanning the ein2 locus (Figure ). The YAC positions were identified by colony hybridization pursuant to the technique of Matallana, et al., Methods in Arabidopsis Research, eds C. Koncz, N-H Chua, and J Schell, 1992, World Scientific Publishing Co., Singapore, the disclosures of which are hereby incorporated by reference in their entirety.
YAC clones are replicated in the yeast cells as authentic chromosomes and so they are present as only one copy per cell. This is an important difference with bacterial colony hybridization and makes colony filter treatment a critical step for successful sequence detection. After growing colonies overnight on the filters, the cell walls were digested and the spheroplasts were lysed in order to prepare yeast DNA for hybridization.
Yeast cell wall digestion is stimulated by reducing agents, such as 2-mercaptoethanol or DTT, that modify the wall structure and make it more sensitive to enzymatic action. Colony filters were placed on filter paper soaked in 0.8% DTT in SOE buffer (1 M sorbitol, 20 mM EDTA, 10 mM Tris-acetate pH 8.0) for 2-3 min. before transferring them to filter paper soaked in SOE containing 1% 2-mercaptoethanol and 1 mg/ml Zymolyase 10-T in individual 150 X 15 mm petri dishes. Petri dishes were parafilmed and stacked in a sealed plastic bag and incubated at 37°C overnight.
After spheroplasting, lysis was carried out by placing the filters on whole sheets of Whatman 3MM paper soaked in the appropriate solution. The 3MM sheets were placed on Saran wrap and soaked immediately before use.
The filters were treated as follows:
1. 10% SDS for 10 min.
2. 0.5 M NaOH for 10 min. (1.5 M NaCl should be included for Hybond). Repeat.
3. Air dry for 5 min.
4. 1 M Tris-HCl (pH ), 1.5 M NaCl for at least 5 min.
5. 0.1 M Tris-HCl (pH ), 0.15 M NaCl for at least 5 min. Cell debris on the filters was eliminated by gently wiping the filters with Kimwipes soaked in the same solution.
6. 2xSSPE for at least 5 min. This step precedes hybridization.
Following lysis, the filters are air dried for 30 min. and baked for 2 hours at 80°C.
The left ends of the identified YAC clones were isolated by plasmid rescue according to Bell et al., 1994.
Right ends were isolated by either vectorette PCR according to the methods of Matallana, et al., 1992, supra, or inverse PCR as described by Bell, et al., 1994, supra, the disclosures of which are hereby incorporated by reference in their entirety. The yUP library appeared to be missing clones corresponding to ATHCTR1; three clones hybridizing to this locus were found within the EG library (Grill and Somerville, Mol. Gen. Genet. 1991, 226, 484, incorporated herein by reference in its entirety). The pEG23G5L left end plasmid rescue hybridizes to useful EcoR I and Xba I polymorphisms and hybridizes to the same lambda clone as ATHCTR1 (λctg24; Kieber et al., Cell 1993, 72, 427, incorporated herein by reference in its entirety). The left end rescue pyUP2G11L hybridizes to EG23G5, linking the Ubq6121/g3715 and ATHCTR1 clones into a contiguous array.
pyUP2G11L also contains a Bgl II polymorphism that is informative in the ein2-1 X DP28 cross. The three plants that are recombinant at ATHCTR1 are also recombinant at pyUP2G11L; this indicates the position of ein2 is distal to this YAC end (Figure 1).
To facilitate the identification of the ein2 locus, 24 alleles were identified (Table 1; Guzman and Ecker, Plant Cell 1990, 2, 513, incorporated herein by reference in its entirety). Many of these alleles were generated by X-ray or diepoxybutane mutagenesis; these mutagens are known to create polymorphisms that are detectable by hybridization to a genomic Southern blot (Clark, et al., Genetics 1986, 112, 755; Reardon et al., Genetics 1987, 115, 323, incorporated herein by reference in their entirety). EcoR I, Hind III, BamH I, Bgl II, and Sal I genomic Southern blots were made to find such a polymorphism in the mutant alleles of ein2. The following probes that mapped between Ubq6121 and yUP2G11L were hybridized to the genomic allele blots: Ubq6121, EG19A10L, yUP2G11R, g3715, yUP19E11L, EG23G5R, and yUP2G11L. The cosmid clone g3715 hybridized to a restriction fragment length polymorphism in ein2-12 that corresponds to a lost EcoR I site (Figure ). Based on this missing EcoR I site, this region was examined further.
The 1.2 kb EcoR I fragment that corresponds to one of the missing bands in ein2-12 was subcloned from g3715 into pKS (Stratagene, La Jolla, CA); this clone is named pgEE1.2 (Figure ). The pgEE1.2 insert was used to isolate 22 cDNA clones made from ethylene treated three-day old etiolated Arabidopsis thaliana seedlings (Kieber, et al. 1993, supra). pgEE1.2 was also used to identify a single genomic lambda clone, λgE2, from a λDASH II library made from adult Columbia plants. The λgE2 clone spanned the 5′ end of the locus and terminated within the 3′ end of the cDNA. Initially the pcE2.5 clone was sequenced but since this clone was not full length, the 5′ ends of pcE2.17, pcE2.20, and pcE2.22 (Kieber, et al. 1993) were sequenced to determine the structure of the full length frame and ending within 60 bp from a putative «TATA» box (Figure ).
Using 5 µg of poly(A+) RNA from 3-day old dark-grown, ethylene-treated Arabidopsis seedlings (hypocotyls and cotyledons) as template and oligo(dT) as primer, first-strand cDNA synthesis was catalyzed by Moloney murine leukemia virus reverse transcriptase (Pharmacia) for construction of the Arabidopsis cDNA expression library. Second-strand cDNA was made as described by Gubler and Hoffman, Gene 1983, 25, 263, which is hereby incorporated by reference in its entirety, except that E. coli DNA ligase was omitted. After the second-strand reaction, the ends of the cDNA were made blunt with Klenow fragment, and EcoR I-Not I adaptors (Pharmacia) were ligated to each end. The cDNA was purified from unligated adaptors by spun-column chromatography using Sephacryl S-300 and size fractionated on a 1% low melting point minigel. Size-selected cDNAs 1-2, 2-3, and 3-6 kb) were removed from the gel using agarase (New England BioLabs), phenol-chloroform extracted, and precipitated using 0.3 M NaOAc (pH 7)-ethanol.
A portion of each cDNA size fraction (0.1 µg) was coprecipitated with 1 µg of λZAPII EcoR I-digested, dephosphorylated arms and then ligated overnight in a volume of 4 µl. Each ligation mix was packaged in vitro using Gigapack II Gold packaging extract (Stratagene). The structure of this locus was determined by Southern hybridization and restriction mapping of the λgE2 and g3715.
The sequence of the EIN2 genomic DNA was determined from PCR products and the λgE2 genomic lambda clone. Primers were selected from the sequence of the pcE2.17, and genomic subclones of λgE2.
The primers were then commercially synthesized (Research Genetics, Huntsville, AL).
Table 3
PRIMERS FOR THE EIN2 LOCUS

SEQUENCE ID NO.   Primer Name   Sequence                   Position
21                PE2.7A        GGATCCTCTAGTCAAATTACCGC
22                PE2.7B        AGATCTGGTATATTCCGTCTGCAC
23                PE2.5′        CCGGATTCGGTTTGTAGC         PCR/ 3′ end
24                PE1           GACGTGCATGTTCTTGGG
25                PE2           GAAAGCCACATCACCTGC
26                PE3           GGGGTGGAGTTATCCAC
27                PE4           GACACCGGGAAGTATCG
28                PE5           CTGCTTTCATAGAAGAGGC        PCR/ middle
29                PE6           GTCAGAACAAACCTGCTCC        PCR/ end
30                PE7           CACCCAGGTCTTGGTGG
31                PE8           GGCCGCCATGGATGCG
32                PE9           TCTCAATCAAGAGGAGGC
33                PE10A         CTTGAAGGATCCGAGTGG
34                PE11          CAGGTTGGCGAGTTCCTCG
35                PE12          CTTGCTGTTATTCTCCATGC
36                PE13          CCCTGGACCAGCTCCTGG
37                PE14          TGGCGCAAGCATCGTCCC         PCR/ middle
38                PE15          AAATGTTCAGGAATCTCTCG
39                PE16          CTGGCTGGCAGCCACGCC         PCR/ 3′ end
40                PE17          GCGTTCTCAAAGCTGCGG
41                PE18          ACTGATGGGTCTTCTGGG
42                PE19          GGATCAGGATGGACCCGG
43                PE20          TGGTTGCTGAAGCCAGGG
44                PE21          TCCATTCATAGAGAGTGGG
45                PE22          ATGCCCAAGAACATGCACG
46                PE23          CAACTGATCCTTTACCCTGC
47                PE24          GTTGTTAGGTCAACTTGCG        PCR/ end
48                PE25          CTCTGTTAGGGCTTCCTCC
49                PE26A         GAATCAGATTTCGCGAGG
50                PE27          GTCCAAATGGAGGAAGCC
51                PE28          CCACGACTGTACAATTGACCTTG    engineered Mun I site
52                PE29          CATGATCGCAAGTTGACC
53                PE30          AGAAAACTCTTATCAAGCTACG
54                PE31          AAGCTTATGGGTGCTCGTGC
55                PE32          GGAAAGAGAGAAAGACTCAG
56                PE33          XACCAAGTCATACCCG

Primer sequences are set forth 5′ to 3′.
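Primer lists of this kind are commonly sanity-checked by computing reverse complements and searching for engineered restriction sites. The following Python sketch illustrates such checks on three primers from Table 3 as transcribed here; it is not part of the disclosure, the helper names reverse_complement and gc_fraction are introduced for illustration, and the Mun I recognition sequence CAATTG is supplied from general knowledge rather than from the text.

```python
# Maps each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# Sequences copied from Table 3.
primers = {
    "PE1": "GACGTGCATGTTCTTGGG",
    "PE22": "ATGCCCAAGAACATGCACG",
    "PE28": "CCACGACTGTACAATTGACCTTG",
}

MUN_I_SITE = "CAATTG"  # assumption: standard Mun I recognition sequence

# PE28 is annotated as carrying an engineered Mun I site.
has_mun_i = MUN_I_SITE in primers["PE28"]

# Check whether PE1 and PE22 anneal to the same region on opposite strands.
pe1_rc = reverse_complement(primers["PE1"])
overlap = pe1_rc[:16] in primers["PE22"]

print(f"PE28 contains Mun I site: {has_mun_i}")
print(f"PE22 overlaps reverse complement of PE1: {overlap}")
print(f"PE1 GC fraction: {gc_fraction(primers['PE1']):.2f}")
```

The same helpers can be applied to any row of the table; sequences containing unresolved characters should be excluded first.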
Four overlapping regions of the ein2 locus between 1.2 and 3.2 kb in length were rapidly amplified by polymerase chain reactions (Idaho Technologies, Idaho Falls, Idaho). Conditions for the PCR reactions are as follows: 92°C, 2 seconds; 56°C, 2 seconds; 72°C, 1 minute; cycles. Between 200 and 500 ng of these PCR products were directly sequenced on the ABI373A automated sequencer using Taq Dye-Terminator chemistry (Applied Biosystems Division, PEC). The genomic sequence of the wild type Columbia EIN2 locus is shown in Figure 5. Eight mutant alleles of ein2 were also sequenced and the corresponding mutations identified (Table ). The presence of these mutations in the mutant alleles of ein2 confirms the identity of this gene as EIN2.
Table 4
IDENTIFIED MUTATIONS OF EIN-2

ALLELE        MUTAGEN         MUTATION                                     POSITION*   RESULT
ein2-3        X-ray           Insert T                                     +3642       Frameshift
ein2-4        X-ray           AG to TT                                     +2103       Frameshift
[illegible]   X-ray           ACATGACT                                     +1570       Frameshift
ein2-6        Agrobacterium   AGAGTTGCGCATG (SEQ ID NO: 17)                +965        AGVAH (115) (SEQ ID NO: 18)
ein2-9        DEB             A to C                                       +4048       H to P
ein2-11       DEB             TG to AT                                     +3492       Ochre
ein2-12       X-ray           ATGCTACAATCAGAATTCTTGCAGT (SEQ ID NO: 19)    +1611       AATIRILAV (SEQ ID NO: 20)
ein2-16       X-ray           AGT to G                                     +2851       Frameshift

* Position relative to the start of pcE2.17; see Figure nucleic acid; position 1 corresponds to the beginning of the cDNA.
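The ein2-12 entry in Table 4 can be cross-checked against the text, which associates this allele with a lost EcoR I site. The Python sketch below is offered as an illustration and is not part of the disclosure. It searches the listed nucleotide sequence for the EcoR I recognition sequence GAATTC and translates it with the standard genetic code, on the assumption that the reading frame begins at the third nucleotide of the sequence as printed; the function name translate and that frame offset are introduced here.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, offset=0):
    """Translate complete codons of a DNA sequence starting at the given offset."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


ECOR_I_SITE = "GAATTC"
ein2_12_dna = "ATGCTACAATCAGAATTCTTGCAGT"  # SEQ ID NO: 19 as listed in Table 4
ein2_12_result = "AATIRILAV"               # RESULT column for ein2-12

contains_ecor_i = ECOR_I_SITE in ein2_12_dna
peptide = translate(ein2_12_dna, offset=2)  # assumed reading frame

print(f"EcoR I site present: {contains_ecor_i}")
print(f"Translation in assumed frame: {peptide}")
print(f"Found within listed result: {peptide in ein2_12_result}")
```

The translated fragment falls inside the amino acid sequence listed in the RESULT column, whose first and last residues would come from codons that extend beyond the nucleotides shown.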
EXAMPLE 3
CLONING AND SEQUENCING OF EIN3
In order to clone the EIN3 gene, a collection of 5000 T-DNA insertion lines (Feldmann and Marks, Mol. Gen. Genet. 1987, 208, 1-9, incorporated herein by reference in its entirety) was screened for ethylene-insensitive mutants. A mutant with a phenotype similar to that of ein3-1 (an EMS generated allele) was identified and genetic complementation tests revealed that ein3-1 and the T-DNA insertion mutant (designated ein3-2) were allelic.
Complete cosegregation of the mutant phenotype and the dominant kanamycin resistance marker on the T-DNA indicated that the T-DNA insertion was located within, or at least very close to, the EIN3 gene. Genomic DNA flanking the T-DNA insert was cloned using the left border rescue technique. Genomic Southern blots of wild-type and ein3-2 DNA hybridized with the rescued fragment indicated that the cloned segment of Arabidopsis DNA corresponded to sequences disrupted by the T-DNA insert and did not result from cloning an unlinked fragment of genomic DNA. In all restriction digests the mobility of the hybridizing fragments is shifted in the insertion mutant relative to wild-type.
cDNA and genomic libraries constructed from wild-type DNA were screened with the rescued DNA fragment.
The cDNAs obtained indicated that the EIN3 gene encodes a 628 amino acid open reading frame. Structural features of the predicted polypeptide include: 1) a region rich in acidic amino acids at the amino terminus, 2) several basic domains in the central portion of the protein, and 3) several poly-asparagine repeats near the carboxy terminus.
Although database searches revealed no overall similarities to any characterized proteins, the three structural motifs described are found in transcriptional regulatory proteins.
Stretches of acidic amino acids function in transcriptional activation presumably through binding to other proteins.
Basic domains serve as nuclear localization signals and can bind DNA. Poly-asparagine repeats are present in the SWI1 protein of yeast. This protein has been termed a transcriptional accessory protein because it is required for transcriptional activation of target genes but does not bind directly to DNA. It has been suggested that the poly-asparagine repeats are involved in protein-protein interactions.
Sequencing genomic clones indicated that the EIN3 gene has a very simple structure. There are no introns within its open reading frame. However, there is a single intron located in the 5′ transcribed region. In addition to sequencing the wild-type EIN3 gene, genes from three independently isolated ein3 mutants were sequenced. In each case an alteration was identified, confirming the identification of the bona fide EIN3 gene. In the ein3-1 allele, a point mutation introduces a premature in-frame stop codon. The ein3-2 allele contains a T-DNA insertion which interrupts the coding region. A point mutation in the ein3-3 allele substitutes an acidic amino acid for a basic amino acid within one of the basic regions described above.
The expression pattern of the EIN3 gene in seedlings was examined by placing the GUS reporter gene under control of the EIN3 promoter. The construct employed was a translational fusion including 5′ non-transcribed sequences, the 5′ intron and 93 amino acids of the EIN3 coding region cloned upstream of the GUS gene in the pBI101 vector (Jefferson et al., EMBO J, 1987, 6, 3901-3907, incorporated herein by reference in its entirety) and named pHSEIN3GUS. Arabidopsis root explants were transformed and transgenic plants regenerated (Valvekens et al., PNAS 1988, 5536-5540, incorporated herein by reference in its entirety). The GUS activity patterns observed suggest that the EIN3 promoter is most active in expanding or elongating cells. In three day old etiolated seedlings GUS activity staining is located predominantly in the apical hook and root tips. In younger seedlings in which the hypocotyl is not fully extended, staining is also prevalent throughout this tissue. In 14 day old light grown seedlings abundant GUS activity is observed in the roots, upper portions of the hypocotyl, cotyledons and leaves. The EIN3 promoter is not induced by ethylene, as the levels of GUS activity in air and ethylene treated seedlings appear equivalent. This observation is supported by the fact that steady state levels of the endogenous EIN3 transcript are similar in ethylene and air treated seedlings and adult plants as determined by Northern analysis.
The EIN3 coding region was cloned downstream of the bacterial reporter gene β-glucuronidase (GUS) in the plasmid pRTL2-GUS according to the methods of Restrepo et al., Plant Cell 1990, 2, 987-998, incorporated herein by reference in its entirety, to create pNLEIN3Bgl2 (see Figure). The plasmid was transformed into Arabidopsis protoplasts and transiently expressed according to the methods of Abel and Theologis, Plant J. 1994, 5, 421-427, incorporated herein by reference in its entirety. All detectable GUS activity was targeted to the nuclei of the protoplasts, indicating that the EIN3 protein functions in the nucleus. These results suggest that the EIN3 protein may function as a transcription factor which regulates ethylene-regulated gene expression.
The EIN3 gene is a member of a small gene family.
Low stringency hybridization of genomic Southern blots indicates that there are at least two members in addition to EIN3. Three EIN3 homologues, designated as EIL1, EIL2, and EIL3, have been cloned and sequenced. The EIL and EIN3 predicted polypeptides are structurally similar in that the amino termini of the proteins are rich in acidic amino acids and their central regions contain several basic domains. Their carboxyl termini are not as well conserved, as EIL1 contains a polyglutamine repeat instead of poly asparagine repeats. The EIL2 and EIL3 polypeptides do not contain polyglutamine repeats or poly asparagine repeats.
It is interesting to note that the amino acid substitution in the ein3-3 allele occurs in one of the regions rich in basic amino acids that is completely conserved between the EIN3 and EIL polypeptides. Currently, it is not known whether the EIL gene products function in the ethylene signal transduction pathway of Arabidopsis. However, at this time, the EIL1 and EIL2 cDNAs do not map to the same location as any of the characterized ethylene response mutations. The location of the EIL3 cDNA has not yet been mapped. The EIL1 polypeptide is the most similar to EIN3.
The ein3 mutant alleles were sequenced on an Applied Biosystems 373A DNA Sequencing System (Foster City, CA) using Taq dideoxy terminator chemistry (Applied Biosystems). The PCR primers are set forth in Table 5.

TABLE 5
PRIMERS FOR EIN3 PCR

SEQ ID NO.   PRIMER NAME   SEQUENCE                POSITION in genomic
57           PR24          CCTTCTATATTTGGTTCC      680-698
58           PR15          CCATTCTCCGGAATAATCC     1306-1324
59           PR5           CACGGAGCAGGATAAGGGTA    1148-1166
             PR19          CGGATTGGATTGTGTGTGC     3312-3331

The primer sequences are set forth 5′ to 3′.
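As a quick sanity check on the EIN3 primers listed above, their length, GC content, a rough melting temperature, and the genomic span covered by each primer pair can be computed from the listed sequences and positions. This is a minimal sketch under stated assumptions: the Wallace rule (2 °C per A/T, 4 °C per G/C) is a common rule of thumb for short oligonucleotides and is not taken from this document, the function names are hypothetical, and the genomic coordinates are used exactly as listed and treated as inclusive. The pairings PR24 with PR15 and PR5 with PR19 follow the amplification described in the text.

```python
# Primer sequences (5' to 3') and genomic positions as listed in the table.
PRIMERS = {
    "PR24": ("CCTTCTATATTTGGTTCC", (680, 698)),
    "PR15": ("CCATTCTCCGGAATAATCC", (1306, 1324)),
    "PR5": ("CACGGAGCAGGATAAGGGTA", (1148, 1166)),
    "PR19": ("CGGATTGGATTGTGTGTGC", (3312, 3331)),
}


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rule-of-thumb melting temperature in deg C (assumption: Wallace rule)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def amplicon_span(pos_a, pos_b):
    """Length of the genomic interval covered by two primers, inclusive."""
    coords = (*pos_a, *pos_b)
    return max(coords) - min(coords) + 1


for name, (seq, pos) in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, "
          f"Tm ~{wallace_tm(seq)} C, position {pos[0]}-{pos[1]}")

for fwd, rev in (("PR24", "PR15"), ("PR5", "PR19")):
    span = amplicon_span(PRIMERS[fwd][1], PRIMERS[rev][1])
    print(f"{fwd}/{rev}: expected product of about {span} bp")
```

The spans are estimates only; they depend on the listed coordinates being on a common numbering of the genomic sequence.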
Primer pairs PR24/PR15 and PR5/PR19 were used to amplify genomic DNA from the ein3 mutants. PCR amplification was performed with a Biosycler Oven (New Haven, CT). Conditions for amplification were as follows: 92° C for 1 min; 55° C for 1 min; 72° C for 3 min. The mutations discovered are listed in Table 6.
Table 6
IDENTIFIED MUTATIONS OF EIN3

Allele   Mutagen   Sequence change          Consequences of sequence change
ein3-1   EMS       G to A, position 1598    amino acid 215, W to umber
ein3-2   T-DNA     position 2001            T-DNA insertion
ein3-3   DEB       G to T, position 1688    amino acid 245, K to N

The EIL genes were obtained by screening an Arabidopsis seedling cDNA library (Kieber et al., Cell, 1993, 72, 427-441) at low stringency in the following manner. The cDNA library was hybridized with the radiolabeled EIN3 cDNA insert at 42° C for 48 hours in a hybridization solution consisting of 30% formamide, Denhardt’s solution, 0.5% SDS, 5X SSPE, 0.1 mg/ml sheared salmon sperm DNA, according to the methods of Feinberg and Vogelstein, Anal. Biochem. 1984, 177, 266-267, incorporated herein by reference in its entirety. The filters were washed at 42° C with 30% formamide, 0.55 SDS (should this be 0.5% SDS?), 5X SSPE; followed by 2X SSPE.
EXAMPLE 4 HOOKLESS MUTATION OF THE APICAL HOOK
The "triple response" in Arabidopsis thaliana occurs in response to the plant hormone ethylene and is characterized by three distinct changes in the morphology of etiolated seedlings. These include exaggeration of the apical hook, radial swelling of the hypocotyl, and inhibition of root and hypocotyl elongation. Observation of the apical hook was recorded by Charles Darwin as early as 1896.
The hook causes the apical portion of the seedling to become nearly parallel with the basal portion.
Production of the bend in the hypocotyl requires either a larger number of cells, or increased elongation of cells on the adaxial side (outside) of the hook. A study of the characteristics of hook formation in bean seedlings demonstrated that the curvature is produced by differential growth rates on each half of the hypocotyl resulting in longer cells on the convex side of the hook, see Rubenstein, 1972 Plant Physiology 49:640-643.
Previous studies suggest that hormones may be involved in hook formation. The hormones involved are believed to be auxin and ethylene. Auxin is known to be a controlling factor in cell elongation in the hypocotyl, see Klee and Estelle, 1991 Annual Review of Plant Physiology 42:529-551, incorporated herein by reference in its entirety, and ethylene has been shown to exaggerate the bending of the hook in wild type etiolated seedlings (Guzman and Ecker, supra). One hypothesis to explain hook formation is that auxin promotes elongation of cells on the outside of the apical hook, allowing differential growth rates and bending. Work performed by McClure and Guilfoyle (1989) demonstrated that the initial uniform expression of small auxin up RNA (SAUR) mRNA on both sides of the hypocotyl was altered when the tissue was transferred from an erect to a horizontal position. An increase in SAUR mRNA accumulation was observed on the "outside" region and a concurrent rapid decrease in SAUR mRNA occurred on the "inside" region of an upward bending hypocotyl. Ethylene has been shown to alter transport of auxin in hypocotyl tissue (Mattoo and Suttle, supra), suggesting a possible role for ethylene in exaggeration of the hook. To exaggerate the hook, ethylene might affect auxin localization, causing even more bending on the outside of the hook.
The triple response of Arabidopsis has been used to isolate mutants affected in the ethylene response. The hookless1 (hls1) mutant exhibits a tissue specific defect in the triple response. Null mutants (hls1-1) completely lack the apical hook in the presence and absence of ethylene, while weak alleles of hls1 (hls1-2) show some bending in the hook in the presence of ethylene. The complementation cross between hls1-1 and hls1-2 gave rise to F1 progeny which resembled hls1-2. In addition to hls1-1 and hls1-2, six EMS alleles, three DEB alleles, one X-ray allele, and two non-tagged T-DNA alleles have been isolated in accordance with the methods set forth in Guzman et al. The Plant Cell 1990 2:513-523, hereby incorporated by reference in its entirety (Table 7). Seven of these are strong alleles which are completely hookless in the presence of ethylene. Five of these are weak alleles showing a partial bend in the presence of ethylene. The hls1 phenotype is epistatic in the hook with other ethylene mutants.
Table 7
IDENTIFIED PHENOTYPIC AND PROTEIN MUTATIONS OF HLS1

ALLELE    MUTAGEN   HOOK ANGLE     CHANGE
hls1-1    EMS       2.2 ± 0.9      aa345 E to K
hls1-2    T-DNA     26.2 ± 3.2     T-DNA insertion
hls1-3    X-RAY     8.1 ± 1.8      4.8kb deletion of promoter
hls1-4    DEB       ND (strong)    aa345 E to K
          DEB       1.3 ± 0.5      splice donor site mutated
hls1-6    EMS       2.1 ± 1.0      aa326 K to W
hls1-7    DEB       3.0 ± 1.3      splice donor site mutated
hls1-8    EMS       2.1 ± 1.2      aa180 R to stop
hls1-9    EMS       6.3 ± 1.5      aa11 R to stop
          EMS       23.2 ± 3.0     aa1 M to I
hls1-12   T-DNA     3.0 ± 1.2      ND
hls1-12   EMS       ND (weak)      NC
hls1-13   EMS       ND (weak)      NC
hls1-14   T-DNA     ND (strong)    ND

ND = not determined; NC = no change in coding region or introns

Gene Structure and Analysis
The HLS1 gene was cloned by left border rescue of a T-DNA inserted in the promoter of hls1-2. The rescued fragment was used to isolate a LJkb genomic clone which was then used to isolate three cDNA clones. The T-DNA was found to have inserted 710 bp upstream from the 5′ end of a 1.7 kb cDNA clone. Deletions of the 1.7 kb cDNA clone were generated in both directions using Exonuclease III. These clones were sequenced using Sequenase 2.0. Deletions of the genomic clone were also generated using Exonuclease III. These clones were also sequenced. The sequence of the genomic clone covered the entire 1.7 kb cDNA as well as 1712 bp upstream of the start of the cDNA and 313 bp at the 3′ end of the cDNA. This gene has two introns of 342 bp and 81 bp in size. The cDNA encoded a 403 amino acid protein of about 43 kDa.
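The figures quoted for the HLS1 gene product can be cross-checked with simple arithmetic. The sketch below is illustrative only: it assumes an average residue mass of about 110 Da, a widely used rule of thumb that is not taken from this document, and the function and constant names are hypothetical.

```python
# Rule-of-thumb average mass of one amino acid residue, in daltons (assumption).
AVG_RESIDUE_MASS_DA = 110.0

# Intron sizes reported for the HLS1 gene, in base pairs.
INTRON_LENGTHS_BP = (342, 81)


def coding_length_nt(n_residues, include_stop=True):
    """Nucleotides needed to encode `n_residues`, optionally plus a stop codon."""
    return 3 * (n_residues + (1 if include_stop else 0))


def approx_mass_kda(n_residues):
    """Approximate protein mass in kDa from residue count alone."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


print(f"coding length for 403 residues: {coding_length_nt(403)} nt")
print(f"approximate mass: {approx_mass_kda(403):.1f} kDa")
print(f"total intron length: {sum(INTRON_LENGTHS_BP)} bp")
```

A residue-count estimate of roughly 44 kDa is in line with the stated value of about 43 kDa; an exact mass would require the actual amino acid composition.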
Sequence Analysis of the Alleles
The hls1 gene from ten of the fourteen alleles was sequenced. The transcribed region as well as both introns were sequenced. The hls1 gene from each allele was isolated by PCR amplification. The sequences of the primers are set forth in Table 8.
Table 8
PRIMERS FOR HLS1 PCR

SEQ ID NO.   PRIMER NAME   SEQUENCE               POSITION in genomic
61           1.1           cgccactgcatgtaagaac    1303-1321
62           11.2          tccacacgcttaatacggc    3229-3211
63           11.6          ggtacggagaagaaggag     2546-2563
64           III.1         cgcgggatattgattcggt    3071-3090
             III.2         gtgttgaacacgcccacaa    ND
66           III.3         acgacaccacaaccacct     3479-3462
67           III.5         gacaagaagacacaaacc     3880-3863
68           pri           gaatcggaggagaaggtc     3386-3403

Primer sequences are set forth 5′ to 3′.
PCR was performed on a Biosycler (New Haven, CT). Conditions were 92° C, 1 min.; 55° C, 1 min.; 72° C, 3 min. for 35 cycles. Some of the PCR products were subcloned and sequenced using Sequenase. Additional PCR products were sequenced directly using sequence specific primers and Taq sequencing on an ABI automated sequencer (Foster City, CA).
Alleles found to contain a sequence change from wild type were confirmed by direct sequencing of the PCR product along with a wild type control. The changes found in these alleles are listed below in Table 9.
Table 9
IDENTIFIED GENOTYPIC AND PROTEIN MUTATIONS OF HLS1

ALLELE    MUTAGEN   SEQUENCE CHANGE          CONSEQUENCES OF SEQUENCE CHANGE
hls1-1    EMS       G to A, position 3487    aa345 E to K
          DEB       T to A, position 2194    splice donor site mutated
hls1-7    DEB       T to A, position 2194    splice donor site mutated
hls1-6    EMS       T to G, position 3431    aa326 K to W
hls1-4    DEB       G to A, position 3487    aa345 E to K
hls1-9    EMS       C to T, position 2060    aa11 R to stop (CGA to TGA)
hls1-8    EMS       C to T, position 2992    aa180 R to stop (CGA to TGA)
          EMS       G to A, position 2033    aa1 M (start) to I

Two alleles which showed no changes in the transcribed region or in the introns, hls1-12 and hls1-13, were both weak alleles. hls1-12 was found to have reduced levels of transcript compared with wild type. It is possible that there are sequence changes in the promoter region of hls1-12 and hls1-13.
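The codon-level consequences reported for the HLS1 point mutations can be checked against the standard genetic code. The sketch below is a minimal illustration, not part of the described methods: the function names are hypothetical, and only the CGA to TGA change is stated explicitly in the table. The ATG start codon follows from the M (start) entry, while GAA as the wild-type codon for the E to K change is an assumption (GAG would give the same amino acid change).

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def substitute(codon, index, new_base):
    """Return `codon` with the base at 0-based `index` replaced by `new_base`."""
    codon = codon.upper()
    return codon[:index] + new_base.upper() + codon[index + 1:]


def mutation_effect(codon, index, new_base):
    """Return (wild-type residue, mutant residue); '*' denotes a stop codon."""
    mutant = substitute(codon, index, new_base)
    return CODON_TABLE[codon.upper()], CODON_TABLE[mutant]


# C to T in the first position of CGA (R to stop, as listed in the table).
print(mutation_effect("CGA", 0, "T"))
# G to A in the third position of the ATG start codon (M to I).
print(mutation_effect("ATG", 2, "A"))
# G to A in the first position of GAA (assumed codon for the E to K change).
print(mutation_effect("GAA", 0, "A"))
```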
Spatial and Temporal Detection and Expression
Northern analysis of the alleles revealed that the weak alleles hls1-2, hls1-3, and hls1-12 all show a reduction in the amount of transcript. The HLS1 transcript was found to be up-regulated by ethylene.
HLS1 Homology
Sequence comparison was done at the DNA as well as the amino acid level using BLAST and TFASTA (GCG). Some homology to one class of acetyl transferases was found.
There are several classes of acetyl transferases with little homology between classes. The homology within one class of acetyl transferases consists of only a loose consensus. HLS1 is similar to a class of acetyl transferases found in bacteria and yeast and not similar to the class found in mammalian systems. Tercero, JBC 1992, 267, 20270, published a minimum consensus for one class of acetyl transferases. Other members of this class include the yeast MAK3 gene, which acetylates a viral coat protein and perhaps some mitochondrial proteins. The rimL and rimJ proteins are also in this class of acetyl transferases. These are E. coli proteins which acetylate ribosomal proteins L12 and L5. Also included in this class is the ARD1 protein of yeast. Mutants in this gene show a specific mating defect, an inability to sporulate, and loss of viability in stationary phase. There are several other bacterial members of this class. The other 150 amino acids of the HLS1 gene show no significant homology to any proteins in the database.
Various modifications of the invention in addition to those shown and described herein will be apparent to those skilled in the art from the foregoing description. Such modifications are also intended to fall within the scope of the appended claims.
SEQUENCE LISTING
GENERAL INFORMATION:
APPLICANT: Trustees of The University of Pennsylvania
(ii) TITLE OF INVENTION: Plant Genes for Sensitivity to Ethylene and Pathogens
(iii) NUMBER OF SEQUENCES: 82
(iv) CORRESPONDENCE ADDRESS:
ADDRESSEE: Woodcock, Washburn, Kurtz, Mackiewicz Norris
STREET: One Liberty Place, 46th floor
CITY: Philadelphia
STATE: PA
COUNTRY: USA
ZIP: 19103
COMPUTER READABLE FORM:
MEDIUM TYPE: Floppy disk
COMPUTER: IBM PC compatible
OPERATING SYSTEM: PC-DOS/MS-DOS
SOFTWARE: PatentIn Release Version #1.25
(vi) CURRENT APPLICATION DATA:
APPLICATION NUMBER: PCT/US95/07744
FILING DATE: 15-JUNE-1995
CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
APPLICATION NUMBER: 08/261,822
FILING DATE: June 17, 1994
(viii) ATTORNEY/AGENT INFORMATION:
NAME: Beardell, Lori Y.
REGISTRATION NUMBER: 34,293
(ix) TELECOMMUNICATION INFORMATION:
TELEPHONE: (215) 568-3100
TELEFAX: (215) 568-3439
INFORMATION FOR SEQ ID NO:1:
SEQUENCE CHARACTERISTICS:
LENGTH: 6042 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
TTCTCTCTCT CTCTTTGAAG GTGGCACGAG CACCCATAAC CTTCAGACCT ATAGATACAA
ATATGTATGT ATACGTTTTT TATATATAAA TATTTTATAT AATTGATTTT TCGATCTTCT 120
TTTATCTCTC TCTTTCGATG GAACTGAGCT CTTTCTCTCT TTCCTCTTCT TTTCTCTCTC 180
TATCTCTATC TCTCGTAGCT TGATAAGAGT TTTCTCTTT TGAAGATCCG -TTTCTCTCTC
TCTCACTGAG ACTATTGTTG TTAGGTCAAC CTTTCAAAAA CCCTAATCCT CTGTTTTTTT
TTTCATGGGT
TTTTCCAGCT
TCTCCCTCTT
CTGGGTTTTG
TGTGGTTGAC
TTTTTGTTGC
TTTTTTTTGA
TAATACTACT
TCGAATGGTT
GTTGCTCTGT
TCCTTGAAGC
GAAACTTCAG
TAGAGTGGCT
CTTTGGAGGG
TCAACTGAGG
ATTATATTAT
GGGTTTATCC
ATTGATCCCG
GTGGCAATTA
ATAAGCGTTG
CAAACTTTTT
ATTCTACTAA
TGCAATGAAG
GCAATTCTGC
TTATTATGTA
GGGAGTTGCG
GGCCGCCATG
TTCTTTGCTG
TTCTCTATTA
TGGTATTACT
TGAATGGAGT
TTTTGTAGCT
TTTTATAATT
CATTATCTGG
TTTTAATTGC
CTTTTTTTTC
TTTTCGTCAG
GCTTATCTTC
GTATCTGAAG
TGGAGATATC
TAGGGCTTCC
AGCAAATAAG
CTATTATTAG
TTGTTTGCCT
AATCAGATTT
CTTTTAGTGA
CAGATAGATT
AGAGAATGGT
GGAAATGGGT
CTCTGCTTTT
TGACTGGTAA
AAAATAACAA
CCACCTTCTC
AATATGACAA
TCGACCTTAC
GTAAAATTTT
CATGCACTTA
GATGCGTTTT
TTCTTAATTT
TAAACAGGAA
TCTCTATGTA
GTTAACTCGG
TTTCCCTCGG
TCATCAGGTG
ACTTTCCAGA
GAGATTTAAG
CTTGAAGATC
CGTTACGATT
TTTTTGTTGC
ATTTTCACCA
CGTAGTGGTT
TCCATTTGGA
AAGAAGGTCT
GTAAAGAAAG
CGGTGATAGA
CGCGAGGGAA
ACCACGACTG
TAGGATGGAA
TCCTGCTCTA
TGCAAATATC
CAATTTTGCC
ACACTTGGCT
ACTGGGCTCT
TATTTTTCTA
GTGGACGTGC
CATGGTAGTT
GATTCCTCTG
ACCTTTTGTT
TATTTCCTGT
TTTTATTATG
AATGGTATGG
TCTGGCGTCT
TTAAATGGAG
TTGCGATCAT
TTTTATTTTG
CTTTTGCGCA
GATCGAATTC
CTTGGAGTTC
CTTTTTTCTT
TGAATGCGTA
CTTTTAGCTT
TGCTTCATAC
TAAAAAAAAA
AAGCATATGG
CCAACTTAGC
GGTCCTTAAC
ACTGTACAGA
AGAACCGACT
GGTGTTTTAG
TAGAGTTGAC
GCTGAAATTG
CTTCCTGTCC
GAAGGAGGTG
GCCATCTTAT
CAGGTAAACA
GTGGTTGTCT
ACATTTTAAT
ATGTTCTTGG
ACTTACAATT
ACTTGAGCTT
TGGGGTGGAG
TTTCGCCTCT
TAGTAAAATT
CAAATACAGT
TGCTGAGTCA
AGAGCGCATT
GGCGATTTCG
CTGGGGGGCT
AATGAGACTT
GTAGTTTCAG
TTCGGGATTG
TTTTACTACT
GATCATACGG
CAGTTTAGTT
TAAGATCAAT
ATTCAGGTCT
AAGTCTATGT
TGAATGTTGT
TTAACATCTG
GTTGTATAAC
GATTCGTTGT
ATCAAATCTG
CTTGAATCCT
TGAATGTGAG
TTTTGGTTTC
CTCGTTTCGG
GCCAATATGT
TTTTTCTGAT
TGTCACTTTC
GTTCTTTACT
GCATTCAGGC
CTTTGCTGTT
CTCTATTATA
TTATCCACTG
TTCCTTGTAG
TTGATTCCTC
ATCCATTTAC
GTCTGAGATC
CGCACTGATG
AAGGTGACTT
TTGTACGGAC
TCTGGGTTTT
CTTAGATCTC
TTTTCGGTTT
GTACTTGGTT
GATCTTTGCA
GAAATTTGTA
TATTGATTTG
GAAGCTGATT
TCTGCTCTTG
ATGATCTCTC
GTTACTAGAG
AAGTAAGCGT
TGTGTGTTAG
TGAATTTTAC
ACTCTGAGTA
ACCTCAGCTA
TGTCGGATAT
GTATGACTTG
TGCAGCTCGC
CTCTAAAGAG
TCAAAGTGGA
GGGACAGATC
GGAGTTCTCA
CTTAATTTTT
A.ACAGGTTGT
GAGTGTTTTT
TTACTTACAA
TGACTTGAGC
TCTGCAGGCC
CCACTCTCTA
GGTCTTCTTG
300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220
GCGCAAGCAI
TCTCTTTATP
CTCTATTACP
ATTTGTTCGC
ATGCAGCAGC
TGTCACTAAT
TTTTAGGGAA
ATCACAGGTA
TAGTCAAATT
GAAGATAGAA
TGCGCTTTAT
CCAGGTCTTG
GTCGAGACAA
AACGTTTTTG
CAGTGACTGG
CACTCTGCTT
GCTGAAATCT
TTTATCTTAT
AGACGAATCA
TACTAGCTCG
GAGCCCTCCA
TAAGGAAGAC
CAGTGATAAG
GGAGAAGATT
TTCATGGGAA
TGATGGTCCT
TTCACGGTTG
ATTTTGGGGA
ACTAGATCAG
TGGAAAAGAC
GATGACTTCA
GTTGTATGGA
GGGTGCATAT
CTCTAGCCTG
‘CGTCCCTCAC
LTGTATCTCTC
GGAAAGTACA
CATCTTTGGT
TAATGTGTTT
GGAGCAGGTT
AATGTTCAGA
TTTATGAGTC
ACCGCACTAG
ATACCCGCTT
TGTGTATGGA
GTGGC’AATGA
ATCATGGGTG
GGATTTCTGG
GCTGGTGGTT
GTATCGTCAT
GCGAGTAACA
CCATCTGTTC
ATAGTGCGGT
GTCTATGATT
GAGGAA.AGAG
TCTGATGTAA
GATCTGATTG
GTTAGCATGG
ACAGAAGAAG
CCTTCATTCC
CAAGGTTTGG
CATTTATATG
CTGTTTGGCA
ATTAGCAGTG
AGTTTATATG
TTACAAAGAG
GGTAACACCA
CGTGCTCCAT
AATTTTTATA TCCATTCTTA TTCTCTGTTA AGAAGCAATA
TCTTCGTCTG
GTCTTCAGCG
CACAGTACTG
TGTTCTGACG
AATCTCTCGT
CGCTCATTCC
CTTGGGCTTT
GGCTTCATCG
CATCTGGTGC
TGCTTCCTTG
TCCATAAAAT
GGTTGAATGT
TGAGATGGAA
GTGCATCCTT
GAGCGGAAGC
AAGAAGAGGA
TGGAAAGCAG
TGCCAGAGAA
AGTTGGATGT
AGGAACAGTC
TTGAAACAAA
AGAATAACAG
CTACCAAAGC
GCAGCTTAAG
GACGTGCTGC
ATTTTCATGG
CTGATCAAAA
GATATTGCAT
ATTCACTGAA
GTTCGTCACC
CTAATAATAA
CATCTTCAGA
ATGTCGACA
GACTGTCACT
GCCTTGTGGT
GTTTTATGTT
GATTATTAAT
AGTGGTCTTT
CGGTGGAGAG
TtGCTACAATC
AGACGGAATA
CTCGGTAATA
CCCTCAGGTT
TGTTTTTGTT
TACCGGTATG
ATGCCTGATA
TCAAATATGG
AATTGAAAGA
GGTAAAGGAT
CATTCTAATG
AAAGTACTCT
TGTATTGCAG
GATGGCGAAA
CAAGTTTATT
TGCTCCTACA
TGGGGAAGGG
CCGGAGACAC
GCAATTGGTT
GTCAGCCTCT
TCACCAACT
GCAGCAGAGG
GTCACCGTTG
TAATGCTTAC
GGGTTGGGAA
TTTTGCTGGG
ATTATACTAA
GAGCAGCTTG
TGTAAATTAT
ACTGACTTTT
CGTATTAGTC
TATCTTGTTC
TTGATGCTCT
GTCGTCCTGC
AGAATTCTTG
TACCAGTTAC
CCGCTTTTCC
GGCGAGTTCC
GTTGAGATGG
GGCACCTCGA
CTCTGGCTGG
AACATGGATG
ACAGAAACAA
CAGTTGGATA
ACGGATCAAG
ACCTCTCAAG
TCAACAGTGG
ATTGAACCAA
GAAAAGGATG
AGCAACTTTA
GGAAGTGGGA
TTATCTGCGA
GCTGAAGCCA
TCTATGAAAG
GCGAAGGGAA
ACACCGGGAA
GTCAACCGTA
GAATTGAGTG
CACCAACAAC
GTACCTTTTT
GCAGTGAACG
TGTCAAGACC
GTATTGATGA
CACGATGCCT
AATAATTCAT
TTGATTGTTG
TGTTCTTCTC
ATGACTTCCT
CAGTTGCTCC
TTATATTCAC
GCATTGCTTC
TCGCACTTAC
TATTTGGGAG
TTCAGTACAC
CAGCCACG3CC
CTCAAAATGC
GGAGGAACGA
CTACGTCTGT
AA.ATCCGTTC
TTAGTAGTCT
TTAATGAGGT
TGAGTCCTGT
TTGAAGGGGT
CTGTCGGATC
CTGGAAGCCT
TCCTTGATGA
GGGCAAAGAA
CAGATTCGTT
TGGATTCACA
GTATCGATTC
TGCAGATGTT
AGAGAAGATA
CAGCTACAGT
2280 2340 2400 2460 2520 2580 2640 2700 2760 2820 2880 2940 3000 3060 3120 3180 3240 3300 3360 3420 3480 3540 3600 3660 3720 3780 3840 3900 3960 4020 4080 4140 4200 4260
TCACGGATAC
ACAATCCCGT
ACAGCAACTT
TCCTGGGTTT
TTACGGTGTT
AAAATATAGT
ACCAAACAAC
TTATGGTCGG
ACCCTCGACT
GCCACAGAGT
TGGTGTAGCG
CAATATAGAC
CAGGCACTGT
CZ3ATGGAGTT
AGCTGAAGCT
TCCTAACTGT
GTGCATTCAC
GTACACTTAC
CACAGTTTTG
GCGTTCTCAA.
CACCAGAGAG
AAAGGCAAAT
ATCTCTTGTA
AAAGAGAATT
AGGTATGAAT
TTGAAGAAGA
AAGAATCAAA
GAAGAA.AAAA
TTGTAATGTT
CCGGTTTGTG
CAGATGAAGI
GGAGAGATCC
GCTTTAGCCT
GAGAATTTTG
CCATCTTCTG
AGCATGCCAG
AAGAGTGGAT
TTAAGCAATG
TATGATGACA
GCAACAACAG
GAGAGGAATG
AACAACGCTT
ATTCTAAAGC
GATGAAGAAC
CGAGAAATAA
GGAGATGGTT
CGTGTCCTTG
GTTCTCAACC
CAGTTTGT-A
AGCTGCGGAC
CGAGTCCGAC
GCACAACCGC
GAAAAGGCCG
TGGCTTCGGT
CAGGATGGAC
AGAACATTGT
ACTCTCGCTT
AATGGATTTT
ATGTTTATAT
TAATTCTATG
CATATGTAGA
CGACATCGAG
TGAAACAGAA
CTGGGTCTAG
GCAATACTGA
ATATCTCAGG
ACTGGGATCC
AATCATCGTT
TTTCTCAATC
GGACCGGATC
GTGCTGTTGG
CTTCTAATGT
TTATTAAACT
TGATTGACCG
ACCAGGTGGG
GCGTTTGGAG
ACTTGTCTCT
GCCTACAGGG
ATATTGACAT
ACCAATGACA
TTCAGCTAAC
AGTCACACTT
AACCGGTACA
TTCGAAGCGG
CCGGTTCAAG
GAGAAATCTC
TTGATTGCTC
TGTTGCTTCA
ACATATATCA
CGGAATCATA
CAATTTGGCP
ATCTATGGCG
GTCCCAGAAT
AAGCATATCG
TACTGTTGGC
ATTGTCTATG
GTCAAGTGGA
ATATTCTAAT
AAGAGGAGGC
GCTTTGGTCC
TGAGGAGCTC
TGATGCAGAG
TGAAGGATCC
GGTAGCTGCA
TCACATGGGG
AGCTGATTTG
CATGGAGAGT
TAACAAAAAC
GTATGGATTT
CCGTGCTTTT
GGAATGTTAC
CTTGATCTAA
GCTGCAGGTG
TATAASACGTC
AAAAAACGTG
ATGATCAAAG
CTCTGCTTCG
GAATTTTTCG
TCATCATAGG
AAGAAATCGT
AAAGAAAGGC
CTTGGTACAT
GGTCTAACCC
CGACAATCTG
GCAGCAGTAG
TCCGCAAGGA
GGAGGAGGGT
TTGGGGTCAC
TACAGAGATG
AGACAGCCCT
AGGAATAGAT
GCTAAGCTTC
GAGTGGTTGT
CGAGAGAAGT
GAGCCACTAA
ATTGTGAGCT
CGGCCTGAGC
CGCAGTAGTT
AAACAGGAGT
GCCTTCAGAT
CTCCGGCTGC
TCAAAGACGT
ATGTGGCTTT
GGTTATCGAA
ACTGCGTACG
TGACGTCGAG
TTAATTGTGT
CTCTTTTTTT
ACCATAGCTA
CG
TTGAAGCCTT
TGAGCTATAC
CTGGACCAGC
AAAGATCTTA
CCAATGAGAA
ACATGCATTT
ATGGTGCGTC
GGGTGGGAGT
CCTACAGTTT
TTGAGCAGTT
CGAATCCGAT
TTCAGTCGTT
TTGGACAAAG
TTATCTATGA
TTTCATCGGT
TTGGAGTTTG
TTTGGGGAAA
CATTGAAAAT
GATTGATCCG
TCCAGCGAGC
AAAACCGGCT
TGAAATGGCA
CCCAAAGGGG
TAAACCAGTA
GATCATTGGG
AGGGAAGCCG
ATTAAGAAAA
CTTAATTTGG
CAAACCGAAT
4320 4380 4440 4500 4560 4620 4680 4740 4800 4860 4920 4980 5040 5100 5160 5220 5280 5340 5400 5460 5520 5580 5640 5700 5760 5820 5880 5940 6000 6042
INFORMATION FOR SEQ ID NO:2:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 4747 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
CTTTTCTCTC
CGTTTCTCTC
CGAAGGTCTG
AGTCTATGTT
GAATGTTGTA
TAACATCTGG
TTGTATAACA
ATTCGTTGTT
TCAAATCTGT
TTGAATCCTA
GAATGTGAGA
TTTGGTTTCT
TCGTTTCGGG
CCAATATGTT
TGAAGAATAT
TCTGCTCGAC
GGAGTTATCC
CTCTTTCCTT
ACTTCTCTAT
AGTGTTAACT
CATCGTCCCT
TGATGTCGAC
CGGACTGTCA
TGGCCTTGTG
TCCGCTCATT
AGCTTGGGCT
TTGGCTTCAT
GACATCTGGT
GATGCTTCCT
TCTATCTCTA
TCTCTCACTG
AAGCTGATTT
CTGCTCTTGG
TGATCTCTCT
TTACTAGA&G
AGTAAGCGTT
GTGTGTTAGC
GAATTTTACT
CTCTGAGTAA
CCTCAGCTAG
GTCGGATATA
TATGACTTGG
GCAGCTCGCA
GACAAGTGGA
CTTACCATGG
ACTGGAGTGT
GAAAATGGTA
GTATCTGGgG
CGGTTAAATG
CACAATTTTT
AAGAGCAGCT
CTTGTAAATT
GTACTGACTT
CCAGTGGTCT
TTCGGTGGAG
CGTGCTACAA
GCAGACGGAA
TGCTCGGTAA
TCTCTCGTAG
AGACTATTGT
CGAATGGTTT
TTGCTCTGTT
CCTTGAAGCA
AAACTTCAGC
AGAGTGGCTT
TTTGGAGGGA
CAACTGAGGC
TTATATTATC
GGTTTATCCA
TTGATCCCGG
TGGCAATTAC
TAAGCGTTGT
CGTGCATGTT
TTGTGGGAGT
TTTTGGCCGC
TGGCAAATAC
TCTTGCTGAG
GAGAGAGCGC
ATATCCATTC
TGTGTCAAGA
ATGTATTGAT
TTCACGATGC
TTTTGATGCT
AGGTCGTCCT
TCAGAATTCT
TATACCAGTT
TACCGCTTTT
CTTGATAAGA
TGTTAGGTCA
GGAGATATCC
AGGGCTTCCT
GCAAATAAGA
TATTATTAGG
TGTTTGCCTC
ATCAGATTTC
TTTTAGTGAA
AGATAGATTT
GAGAATGGTT
GAAATGGGTT
TCTGCTTTTC
GACTGGTAAA
CTTGGGCATT
TGCGCATGCA
CATGGATGCG
AGTATCCATT
TCAGTCTGAG
ATTCGCACTG
TTATTTTGCT
CCATTTGTTC
GAATGCAGCA
CTTGTCACTA
CTTGTTCTTC
GCATGACTTC
TGCAGTTGCT
ACTTATATTC
CCGCATTGCT
GTTTCTCTCT
ACTTGCGATC
GTAGTGGTTA
CCATTTGGAC
AGAAGGTCTG
TAAAGAAAGA
GGTGATAGAA
GCGAGGGAAG
CCACGACTGT
AGGATGGAAG
CCTGCTCTAC
GCAAATATCG
AATTTTGCCG
CACTTGGCTC
CAGGCGGAGT
CTTAACCTT»
TTTTTATTTC
TACTCTGCAG
ATCCCACTCT
ATGGGTCTTC
GGGGAAAGTA
GCCATCTTTG
GCTAATGTGT
ATGGAGCAGG
TCTAGTCAAA
CTGAAGATAG
CCTGCGCTTT
ACCCAGGTCT
TCGTCGAGAC
TTTGAAGATC
ATGGCGATTT
AGCATATGGA
CAACTTAGCT
GTCCTTAACT
CTGTACAGAG
GAACCGACTG
GTGTTTTAGA
AGAGTTGACC
CTGAAATTGT
TTCCTGTCCT
AAGGAC»GTGC
CCATCTTATG
AGATCTGCAA
TCTCAGCAAT
TGTTTGGGGT
CTGTTTTCGC
GCCTGGTATT
CTATGAATGG
TTGGCGCAAG
CATCTTCGTC
GTGTCTTCAG
TTCACAGTAC
TATTTATGAG
TTACCGCACT
AAATACCCGC
ATTGTGTATG
TGGTGGCAAT
AAATCATGGG
120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740
TGTCCATAAA ATCCCTCAGG GGGGTTGAAT GTTGTTTTTG TTTGAGATGG AATACCGGTA ATGTGCATCC TTATGCCTGA CAGAGCGGAA GCTCAPAATAT TCAAGAAGAG GAAATTGAAA GTTGGAAAGC AGGGTAAAGG TTTGCCAGAG AACATTCTAA AGAGTTGGAT GTAAAGTACT AAAGGAACAG TCTGTATTGC TGTTGAAACA AAGATGGCGA GGAGAATAAC AGCAAGTTTA AGCTACCAAA GCTGCTCCTA CCGCAGCTTA AGTGGGGAAG GGGACGTGCT GCCCGGAGAC TGATTTTCAT GGGCAATTGG CACTGATCAA. AAGTCAGCCT
TGGATATTGC
TG-ATTCACTG
AGGTTCGTCA
CACTAATAAT
ATCATCTTCA
GTCATATGTA
CCCGACATCG
CTTGAAACAG
TGCTGGGTCT
TGGCAATACT
AGATATCTCA
ATACTGGGAT
TGAATCATCG
CATTTCTCAA
AGGGACCGGA
TGGTGCTGTT
TTCTTCTAAT
ATGTCACCAA
AAGCAGCAGA
CCGTCACCGT
AATAATGCTT
GAGGGTTGGG
GACAATTTGG
AGATrCTATGG
AAGTCCCAGA
AGAAGCATAT
GATACTGTTG
GGATTGTCTA
CCGTCAAGTG
TTATATrTCTA
TCAAUAGGAG
TCGCTTTGGT
GGTGAGGAGC
GTTGATGCAG
TTGGCGAGTT
TTGTTGAGAT
TGGGCACCTC
TACTCTGGCT
GGAACATGGA
GAACAGAAAC
ATCAGTTGGA
TGACGGATCA
CTACCTCTCA
AGTCAACAGT
AAATTGAACC
TTGAAAAGGA
CAAGCAACTT
GGGGAAGTGG
ACTTATCTGC
TTGCTGAAGC
CTTCTATGAA
CTGCGAAGGG
GGACACCGGG
TGGTCAACCG
ACGAATTGAG
AACACCAACA
CAAAAGAAAG
CGCTTGGTAC
ATGGTCTAAC
CGCGACAATC
GCGCAGCAGT
TGTCCGCAAG
GAGGAGGAGG
ATTTGGGGTC
GCTACAGAGA
CCAGACAGCC
TCAGGAATAG
AGGCTAAGCT
CCTCGCACTT
GGTATTTGGG
GATTCAGTAC
GGCAGCCACG
TGCTCAAAAT
AAGGAGGAAC
TACTACGTCT
AGAAATCCGT
AGTTAGTAGT
GGTTAATGAG
AATGAGTCCT
TGTTGAAGGG
TACTGTCGGA
GACTGGAAGC
GATCCTTGAT
CAGGGCAAAG
AGCAGATTCG
AATGGATTCA
AAGTATCGAT
TATGCAGATG
TGAGAGAAGA
ACCAGCTACA
GCTTGAAGCC
ATTGAGCTAT
CCCTGGACCA
TGAAAGATCT
AGCCAATGAG
GAACATGCAT
GTATGGTGCG
ACGGGTGGGA
TGCCTACAGT
CTTTGAGCAG
ATCGAATCCG
TCTTCAGTCG
ACAACGTTTT
AGCAGTGACT
ACCACTCTGC
CCGCTGAAAT
GCTTTATCTT
GAAGACGAAT
GTTACTAGCT
TCGAGCCCTC
CTTAAGGAAG
GTCAGTGATA
GTGGAGAAGA
GTTTCATGGG
TCTGATGGTC
CTTTCACGGT
GAATTTTGGG
AAACTAGATC
TTTGGAAAAG
CAGATGACTT
TCGTTGTATG
TTGGGTGCAT
TACTCTAGCC
GTTCACGGAT
TTACAATCCC
ACACAGCAAC
GCTCCTGGGT
TATTrACGGTG
AAAAAATATA
TTACCAAACA
TCTTATGGTC
GTACCCTCGA
TTGCCACAGA
TTTGGTGTAG
ATCAATATAG
TTCAGGCACT
TGGGATTTCT
GGGCTGGTGG
TTGTATCGTC
CTGCGAGTAA
ATCCATCTGT
CAATAGTGCG
CGGTCTATGA
CAGAGGAAAG
ACTCTGATGT
AGGATCTGAT
TTGTTAGCAT
AAACAGAAGA
CTCCTTCATT
TGCAAGGTTT
GACATTTATA
AGCTGTTTGG
ACATTAGCAG
CAAGTTTATA
GATTACAAAG
ATGGTAACAC
TGCGTGCTCC
ACCAGATGAA
GTGGAGAGAT
TTGCTTTAGC
TTGAGAATTT
TTCCATCTTC
GTAGCATGCC
ACAAGAGTGG
GGTTAAGCAA
CTTATGATGA
GTGCAACAAC
CGGAGJAGGAA
ACAACAACGC
GTATTCTAAA
1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2580 2640 2700 2760 2820 2880 2940 3000 3060 3120 3180 3240 3300 3360 3420 3480 3540 3600 3660 3720 3780
GCTTATTAAA
ACTGATTGAC
AAACCAGGTG
TTGCGTTTGG
TGACTTGTCT
CCGCCTACAG
CTTTTGCCTT
GTTACCTCCG
TCTAATCAAA
AGGTGATGTG
ACGTCGGTTA
ACGTGACTGC
CAAAGTGACG
CTTCGTTAAT
TTTCGCTCTT
ATAGGACCAT
CTTGAAGGAT CCGAGTGGTT
CGGGTAGCTG
GGTCACATGG
AGAGCTGATT
CTCATGGAGA
GGAGTGATTG
CAGATTCCAG
GCTGCAAAAC
GACGTTGAAA
GCTTTCCCA
TCGAATAAAC
GTACGGATCA
TCGAGAGGGA
TGTGTATTAA
TTTTTCTTAA
AGCTACAAAC
CACGAGAGAA
GGGAGCCACT
TGATTGTGAG
GTCGGCCTGA
ATCCGGCGTT
CGAGCCACCA
CGGCTAAAGG
TGGCAATCTC
AGGGGAAAGA
CAGTAAGGTA
TTGGGTTGAA
AGCCGAAGAA
GAAAAGAAGA
TTTGGTTGTA
CGAATCCGGT
GTTTGGACAA
GTTTATCTAT
AATTTCATCG
CTTTGGAGTT
GCTTTGGGGA
CTCAAAGCTG
GAGAGCGAGT
CAAATGCACA
TTGTAGAAAA
GAATTTGGCT
TGAATCAGGA
GAAGAAGAAC
TCAPAACTCT
AAAAAAATGG
ATGTTATGTT
TTGTGTAATT
AGCGATGGAG
GAAGCTGAAG
GTTCCTAACT
TGGTGCATTC
AAGTACACTT
CGGACACCAA
CCGACTTCAG
ACCGCAGTCA
GGCCGAACCG
TCGGTTTCGA
TGGACCCGGT
ATTGTGAGAA
CGCTTTTGAT
ATTTTTGTTG
TATATACATA
CTATGCGGAA
TTGATGAAGA
CTCGAGAAAT
GTGGAGATGG
ACCGTGTCCT
ACGTTCTCAA.
TGACACCGTG
CTAACGGAAT
CACTTCTTGA
GTACAGCTGC
AGCGGTATAA
TCAAGAAAAA
ATCTCATGAT
TGCTCCTCTG,
CTTCAGAATT
TATCATCATC
TCATAAAGAA
3840 3900 3960 4020 4080 4140 4200 4260 4320 4380 4440 4500 4560 4620 4680 4740 4747
ATCGTCG
INFORMATION FOR SEQ ID NO:3: SEQUENCE CHRACTERISTICS: LENGTH: 1321 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (iii) HYPOTHETIC AL: NO (iv) ANTI-SENSE: NO (xi) Met SEQUENCE DESCRIPTION: SEQ ID NO:3: Giu Ala Glu Ile Val Asn Val Arg Pro 10 Gin Leu Gly Phe Ile Gin Gly Tyr Arg Met Val Pro Ala Leu Leu Pro Leu Leu Val Ser Ile Asp Pro Gly Lys Trp Val Ala 40 Gly Tyr Asp Leu Val Ala Ile Thr so Asn Ile Giu Gly Gly Phe Ala Arg Phe Ala Ala Ile Leu Leu Phe Leu Cys Gin Tyr Val Ala Ala Arg Ile Ser Val 70 Thr Gly Lys SUS1lnUEF(PH26 -Wuo 451IXI IN TICTA)SIM/077)7li 14 Leu Ala Gin Xle CyE Leu Gly Ile Gin Ala Asn Giu Giu Tyr Asp Lys Trp Thr Cys Met Phe 90 Val Ser Phe 145 Ser Gin Gly Pro Ser 225 Ile Asn Phe Ile Ala 305 Lys Ala Ile Pro Met 385 Thr Val Met Val Thr 130 Ala Ala Ser Glu His 210 Ser Phe Ala His Pro 290 Leu Ile Vai Tyr Cys 370 Gly Phe Phe Gly Gly 115 Gly Ser Gly Glu Ser 195 Asn Asp Gly Ala Asp 275 Val Ala Glu Ala 3Gl 355 Ser Val Leu ly rhr io Val Val Phe Leu Ile 180 Ala Phe Val Val Ala 260 Ala Val Trp Ile Pro 340 Leu Val His Gly Ser 420 Ser A .a Phe Leu Vai 165 Pro Phe Tyr Asp Phe 245 Asn Leu Phe Ala Pro 325 Ala Leu Ile Lys Phe 405 S er Ile Glu His Leu Glu 150 Leu Leu Ala lie Lys 230 Ser Va1 Ser Leu Phe 310 Ala Leu Ile Pro Ile 390 Leu %sp .ln Phe Ala Ala 135 Asn Leu Ser Leu His 215 Ser Gly Phe Leu Met 295 Gly Trp Tyr Phe Leu 375 Pro Gly Trp 2 Tyr Ser Leu 120 Ala Gly Leu Met Met 200 Ser Ser Leu His Met 280 Leu 31y Leu Dys rhr 350 ?he «In eu kla Chr Ala 105 Asn Met Met Tyr Asn 185 Gly Tyr Leu Ser Ser 265 Glu Leu Glu His Val 345 Gin Arg Vai C Asn 4 Gly C 425 Thr I Ile Let Asp Ala Vai 170 Gly Leu Phe Cys Leu 250 Thr 3ml Phe VTal Arg 330 rrp lal Ile fly Tal 10 fly .eu Leu Leu Ala Asn 155 Ser Val Leu Ala Gin 235 Val Gly Vai Phe Vai 315 Ala Thr Leu Ala Glu 395 Val Leu Leu Leu Phe Phe 140 Thr Gly Leu Gly Gly 220 Asp Asn Leu Phe Ser 300 Leu Thr Ser Val Ser 380 Phe Phe krg lal Asp Gly 125 Leu Va1 Val Thr Ala 205 Glu His Tyr Val Met 
285 Ser His Ile Gly Ala I 365 Ser Leu Val Trp Ser Let 110 Val Phe Ser Leu Arg 190 Ser Ser Leu Val Val 270 Ser Gin Asp Arg A.la 350 Met krg kla Jai ksn 430 Ser I Thr Glu Pro Ile Leu 175 Leu Ile Thr Phe Leu 255 Leu Pro Ile Phe Ile 335 Asp Met I Gin Leu Glu 415 Thr C Cys I Met Leu Val Tyr 160 Ser Asn Vai Ser Ala 240 Met Thr Leu Thr Leu 320 Leu fly eu Ile Lhr L00 4et fly 1a ueM E sIMrr(RuLAZA) .W 0 9% /3.53 11 435 Ser Leu Cys 450 Leu Ile Lei Ser 465 Leu Arg Asp Glu Glu 545 Lys Val Lys .sn Giu 625 Asp Thr His His Phe 705 Gly I Met I Arg Ser I Asn 1 785 Asn Ser Arg Gin Asn 530 Arg Glu Asn Ile Ser 610 3iu 3iy ly Leu 31y 690 3iy .ys ksp Chr ?ro 170 rhr Arg Tyr Asn LeU 515 Ile Glu Asp Giu Glu 595 Lys Ala Pro Ser Ser 675 Gin Thr Asp Ser Pro 755 Ser Thr Ala Pro Glu 500 Asp Leu Leu Ser Vai 580 Pro Phe Thr Pro Leu 660 Ala Leu Asp Ile Gin 740 Gly Pro Asn Glu Ser 485 Asp Thr Met Asp Asp 565 Ser Met Ile Lys Ser 645 Ser Ile Vai Gin Ser 725 Met Ser Leu Asn Ala 470 Vai Glu Thr Thr Val 550 Vai Asp Ser Glu Ala 630 Phe Arg Leu Ala Lys 710 Ser Thr Ile Vai Asn 79n Trp 455 Gin Gin Ser Ser Asp 535 Lys Lys Lys Pro Lys 615 Ala Arg Leu Asp Glu.
695 Ser 3iy Ser Asp Asn 775 Asn 440 Leu lie Glu Ile Vai 520 Gin Tyr Glu Asp Vai 600 Asp Pro Ser Gin Glu 680 Ala Ala Tyr Ser Ser I 760 Arg Ala Ala Aia Thr Trp Glu Vai 505 Thr Glu Ser Gin Leu 585 Glu Vai Thr Leu 3iy 565 Phe Arg Ser -ys eu 745 eu 4et yr Asn Glu 490 Arg Ser Ile Thr Ser 570 Ile Lys Giu Ser Ser 650 Leu Trp Ala Ser Met 730 Tyr Tyr Gin Glu Met 475 Ile Leu Ser Arg Ser 555 Vai Vai Ile Gly Asn 635 Gly Gly Gly Lys Met 715 Ser Asp Gly Met Leu 795 Pro Leu Lys Ser Ala 460 Asp Giu Giu Vai Ser 540 Gin Leu Glu Vai Vai 620 Phe Glu Arg His Lys 700 Lys Pro Ser Leu Leu 780 Ser Ala Gin Arg Thr Ser Arg 510 Tyr Asp 525 Ser Pro Vai Ser Gin Ser Thr Lys 590 Ser Met 605 Ser Trp Thr Vai Gly Gly Ala Ala 670 Leu Tyr 685 Leu Asp Ala Asp Thr Aia Leu Lys 750 Gin Arg 765 Gly Ala Glu Arg Asn Glu 495 Vai Leu Pro Ser Thr 575 Met Glu Glu Gly Ser 655 Arg Asp Gin Ser Lys 735 Gin Gly Tyr Arg Ala 480 Thr Lys Pro clu Leu 560 Val Ala Asn Thr Ser 640 Gly Arg Phe Leu Phe 720 Gly Gin Ser Gly Tyr 6,- SUDTfErliNE(WAE2j ,WO 95/35318 WO 9535318PCTIIJS9SIO7744 54 Ser Ser Leu Arg Ala Pro Ser Ser Ser Glu Gly Trp Glu His Gin Gin 805 810 815 Pro Ala Thr Val His Gly Tyr Gin Met Lys Ser Tyr Val Asp Asn Leu 820 825 830 Ala Lys Glu Arg Leu Glu Ala Leu Gin Ser Arg Gly Glu Ile Pro Thr 835 840 845 Ser Arg Ser Met Ala Leu Gly Thr Leu Ser Tyr Thr Gin Gin Leu Ala 850 855 860 Leu Ala Leu Lys G’Ln Lys Ser Gin Asn Gly Leu Thr Pro Gly Pro Ala 865 870 875 880 Pro Gly Phe Giu Asn Phe Ala Gly Ser Arg Ser Ile Ser Arg Gln Ser 885 890 895 Giu Arg Ser Tyr Tyr Gly Val Pro Ser Ser Gly Asn Thr Asp Thr Vai 900 905 910 Gly Ala Ala Val Ala Asn Giu Lys Lys Tyr Ser Ser Met Pro Asp Ile 915 920 925 Ser Gly Leu Ser Met Ser Ala Arg Asn Met His Leu Pro Asn Asn Lys 930 935 940 Ser Gly Tyr Trp Asp Pro Ser Ser Giy Gly Gly Gly Tyr Gly Ala Ser 945 950 955 960 Tyr Gly Arg Leu Ser Asn Giu Ser Ser Leu Tyr Ser Asn Leu Gly Ser 965 970 975 Arg Val Gly Val Pro Ser Thr Tyr Asp Asp Ile Ser Gin Ser Arg Gly 980 985 990 Giy Tyr Arg Asp Ala Tyr Ser Leu Pro 
Gin Ser Ala Thr Thr Gly Thr 995 1000 1005 Gly Ser Leu Trp Ser Arg Gin Pro Phe Giu Gin Phe Gly Val Ala Giu 1010 1015 1020 Arg Asn Gly Ala Val Gly Giu Giu Leu Arg Asn Arg Ser Asn Pro Ile 1025 1030 1035 1040 Asn Ilie Asp Asn Asn Ala Ser Ser Asn Val Asp Ala Giu Ala Lys Leu 1045 1050 1055 Leu Gin Ser Phe Axrg His Cys Ile Leu Lys Leu Ile Lys Leu Giu Gly 1060 1065 1070 Ser Giu Trp Leu Phe Gly Gin Ser Asp Gly Val Asp Giu Giu Leu Ile 107 5 1080 1085 Asp Arg Val Ala Ala Arg Giu Lys Phe Ile Tyr Giu Ala Giu Ala Arg 1090 1095 1100 Giu Ile Asn Gin Vai Gly His Met Gly Glu Pro Leu Ile Ser Ser Val 1105 1110 il15 1120 Pro Asn Cys Gly Asp Gly Cys Val Trp Arg Ala Asp Leu Ile Val Ser 1125 1130 1135 Phe Gly Val Trp Cys Ile His Arg Val Leu Asp Leu Ser Leu Met Giu 1140 1145 11S0 Ser Arg Pro Giu Leu Trp Giy Lys Tyr Thr Tyr Val Leu Asn Arg Leu msUBSTITUE MA~rW% .W0 95/35318 .WO 5I331~IICT7IS95IO7744 1155 1160 1165 Gin Gly Val Ile Asp Pro Ala Phe Ser Lys Leu Arg Thr Pro Met Thr 1170 1175 1180 Pro Cys Phe Cys Leu Gin Ile Pro Ala Ser His Gin Arg Ala Ser Pro 1185 1190 1195 1200 Thr Ser Ala Asn Gly Met Leu Pro Pro Ala Ala Lys Pro Ala Lys «-Iy 1205 1210 1215 Lys Cys Thr Thr Ala Val Thr Leu Leu Asp Leu Ile Lys Asp Val Giu 1220 1225 1230 Met Ala Ile Ser Cys Arg Lys Gly Arg Thr Gly Thr Ala Ala Gly Asp 1235 1240 1245 Val Ala Phe Pro Lys Gly Lys Glu Asn Leu Ala Ser Val Ser Lys Arg 1250 1255 1260 Tyr Lys Arg Arg Leu Ser Asn Lys Pro Val Arg Tyr Giu Ser Gly Trp 1265 1270 1275 1280 Thr Arg Phe Lys Lys Lys Arg Asp Cys Val Arg Ile Ile Gly Leu Lys 1285 1290 1295 Lys Lys Asn Ile Val Arg Asn Leu Met Ile Lys Val Thr Ser Arg Gly 1300 1305 1310 Lys Pro Lys Asn Gin Asn Ser Arg Phe 1315 1320 INFORMATION FOR SEQ ID NO:4: SEQUENCE CHARACTERISTICS: LENGTH: 2310 base pairs TYPE: nucleic acid STRAND1EDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: TCTTCTTCTT CTTCCTCTTC CTCATCTCGT ATCTCTAACT GAAACTAGGG TTTATTATCT TCTCCTTCTT 
TTTCCCATCA TTTTTCTTCA TCATTTTTAT TCTCCTTCTT CTTCTGCTGT GATGTTTAAT GAGATGGGAA TGTGTGGAAA CATGGATTTC TGAAGTTGAT TTCTGTCCTG TTCCACAAGC TGAGCCTGAT TACTGATGAT GAGATTGATG TTGATGAATT GGAGAGGAGG GCTTAAACGT CTCAAGGAGC AGGATAAGGG TAAAGAAGGT GCAGTCTCAA GAGCAAGC’TA GGAGGAAGAA AATGTCTAGA
TTTGTCGAAG
CCATAGAAAA
TCATTTCTCC
TTCTCTTCTG
TCCATTGTTG
ATGTGGAGAG
GTTGATGCTG
GCTCAAGATG
TTCTTTTGAT
GGCAGAGACC
AGGTTACAAT
GATCACTTGG
AAGATGACTA
ACAAAATGCG
CTAAACAGAG
GGATCTTGAA
GTATATGTTG
GGAGAATGGG
GGTTAGGTTT
CCCGGGGATT
TCAAGACACG
GAGACGTTTT
TTGGTGGCCT
TGATTTGAAG
TGATATTGCT
TGCTAAAGAG
GCTTTATCCC
GAATGATTGC
AGAGCTCAAG
TGACTTTCCT
GCCAAACAGA
GTGTGCGCAC
ACTGGCATGT
TGTCAATGAA
CCAACCAATT
CATGTCCATG
AAGCGTGTCA
AAACATGGTG
CAACAGCAGC
GTTCGACACT
CGGCAACAGG
CAGAGATGAT
GCAAGATGTA
ATCTTTTGTG
CTATCGCTTG
TTGTCTCGGC
GGATGAACGA
AAGATGATGG
AAGCCTGTGA
GATCGTAATG
CATGAAGGTA
ACTCTTGGAT
CCTTTGGAGA
CAACTTGGTT
AAGGCGTGGA
AAGATCCGTA
AGTGCTACCT
GAGTCATGTC
AGTCAATACG
CCAGAAAAAG
GTCAAAGAAG
GATCTGAACA
AGCGAAATCA
CCACATCGAG
GTTAAGCCTG
GACTTAACGG
TACGACAGA,
CTGCTTCAAC
GAAGGAAGTT
AACAATCAAA
GCAGATCACA
TTCCAGCTTG
ATGTCGATGC
TCCATATGGT
TTCTTACATT
TTATGATGTG
TTGGTGAATC
ACACTPJ.ATG
AAGTTTGTAA
CTGGTGCTTC
GTCCTGCGGC
ATAACCCGAT
CGCTTTTGTC
AAGGAGTTCC
TGCCTAAAGA
AAGTCGGCGT
AGCTCGTGAG
GGCTTGCTAT
CACCTCTTTC
ATGTTGAAGG
TTATGAATTC
AAGTCCCAGC
CTATTATGGA
GCCGGGGATT
ACAGTCGCTT
TAGTTGGATT
GTATAGTTCC
ATGTCCAGAG
CCACAGTCCA
TCTTTGAAGA
CGTTTTTTCA
ACAACTTTGA
TGTTTGATTC
CAGGAGTAGT
TCTAAAGTCT
CACTCAACCA
TCTGTAAGAG
ICTCTGTCAT
TAAGTTTTCA
AGCTCAAGGC
TGATAATTTA
TATTACCAAG
TGGACCGACT
TGCGTTGATG
TCCTCCGCGG
TCAAGGTCCT
TTTGACTGCG
GCAATCTAAA
TATTAACCAA
TCTGTCTGGT
TTTCGAGAAG
TTCAAACTTT
AGGAAACTCG
CAGAACCGTT
TCTGGATAGG
ACCGTATGGA
TCCTCAGCCA
TGAAGATGGA
CAACCAAACC
TAACCATCAA
CTTGAACATC
AGGGAACAAC
AGCTGCACAT
CACACCGTTC
AGGAACGATG
TGGTAGTAGA
TGTAATATTT
TCTCTAAAAA
CATCAGCTTT
TTTGTTTATG
AGGGAGTGGT
TATCAAGCGG
CCTCATACCT
CAACACTGTG
TGGCCTAATG
GCACCTTACA
GTTATCAAGC
TGTTTGCAGG
GAAGAGTCCT
GGAAGTTGCT
GAGTCTCACT
GGGATGGTTG
GAATTCATGA
TTCACCTGCG
AATTCGAGAG
GCAGCACCAT
AGGCCAGTGA
CAGAAGATGA
TCTATGGTCA
GAACATCTCC
CCAAACAGAG
AACAACAACA
AACAACAACA
GACATGGCGT
GATGGAATGC
TTTCATCTTC
TTTCCTGGGT
CTCTCTGTTA
GGATTATTCC
GGAAAGATAA
AGAATAATAT
TGCAAGAGCT
ATCCTCCTCA
GGAAAGAGGA
AGAAGCCTCA
ATATGTTTCC
ATAAGATGAC
TGGCTAGAGA
CGCTTCTGAT
ATGAAGTGGA
CTAAAATGCA
GAAAGAGAAA
AGAATCTTGG
ACAACCATCA
CCAGGTTTCA
ACTCAGTAGC
TCTCAGAGCT
TGGAAAATCA
AGTTCCCAGG
CAAACAACAA
ATGTGTTTAA
ATPACAGTAG
CATTCGATTA
AGCAGAAGCA
TCTTATTTTT
CTCTCTGTCT
CTGTGTGTCT
540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2310 TAGTTACACA CCCGACTTGG INFORMATION FOR SEQ ID NO:5: SEQUENCE CHARACTERISTICS: LENGTH: 3387 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
AGAGCAGTGA
TAATAAATGA
ATTATATATC
TCCTGATCTA
ACGTGATCTT
TCTAACTTTT
CCCATCACCA
CTGCTGTTCA
ATTGATACTG
TTTAGTTTAC
NTTTGGGTGA
TTTCTTATTC
TTTTGGAAAT
GATTAAAATA
CCCTTTTTTA
ATATAATTGT
TTAATGAGAT
TTGATTTCTG
ATGATGAGAT
AACGTCTCAA
CTCAAGAGCA
TGTTGAAGAT
ATGGGAAGCC
GGTTTGATCG
GGATTCATGA
ACACGACTCT
GTATTNCCAC
TGTCTTrAAAT
TANACATATA
CAGAGAGACT
TCGTCGACTT
GTCGAAGTTC
TAGAAAAGGC
TTTCTCCAGG
AAGATGATGA
TAGTGTTTAC
AGTTTTGTTT
ATATATGATC
TGAATCTGTT
CGATCTGATT
TAATTTAGGG
ACTGATTAGT
GGGAATGTGT
TCCTGTTCCA
TGATGTTGAT
GGAGCAGGAT
AGCTAGGAGG
GATGGAAGTT
TGTGACTGGT
TAATGGTCCT
AGGTAPATAAC
TGGATCGCTT
NAGCCGCTTT
TTTATGTGTA
TATATATATR
CCACAAAGAA
TTCTTCTTCT
TTTTGATGAA
AGAGACCTTT
TACTATACGC
TAGGTTTATT
ACGATCTAAT
ATTGTTTATA
CTTTCTATAT
GATAATTTTT
ATGTGTTTAT
TTTGATGATT
TTTGTTTGTG
GGAAACATGG
CAAGCTGAGC
GAATTGGAGA
AAGGGTAAAG
AAGAAAATGT
TGTAAAGCTC
GCTTCTGATA
GCGGCTATTA
CCGATTGGAC
TTGTCTGCGT
GTTAATTACA
AGAAATGAAA
TAAATAGAGT
ACGCAAATAA
TCTTCTTCTT
ACTAGGGTTT
TTCTTCATCA
TTCTTCTTCT
CATAGGGTTT
TTCATGAGTT
AATCGTTGAT
TTGGTTCCTA
ATTATCCGAT
TACTTAAAAC
TTTTTTAGTA
TATTTGATTT
ATTTCTTCTC
-TGATTCCAT
GGAGGATGTG
AAGGTGTTGA
CTAGAGCTCA
AAGGCTTTGT
ATTTAAGGGA
CCAAGTATCA
CGACTCCTCA
TGATGCAACA
TATTAATTGT
TTAAAATGAT
ATATATACTA
ACAAAAGTCG
CCTCTTCCTC
ATTATCTTCT
TTTTTATTCT
ATTGATTTTT
TACTAGATCG
TATNCTACTT
CTATTTGAAA
TGTTGAAGAT
TGATTATTTA
TTTGATTGAA
AGTTGTTTGA
GTTACAGGTT
TTCTGGATCA
TGTTGAAGAT
GAGAGACAAA
TGCTGCTAAA
AGATGGGATC
TTATGGGATT
GTGGTGGAAA
AGCGGAGAAT
TACCTTGCAA
CTGTGATCCT
GTAATAATAA
ATATATGTAT
TGATCTATCT
CTTTCTAGCC
ATCTCGTATC
CCTTCTTTTT
CCTTCTTCTT
TAGGGTTATT
ATGGTTTTAC
TTAGTTTTTT
ATGTTTTCTC
CTCATCCTTT
GTTTAGGAGT
TTCGAAAAGC
TTCAGAAGAA
ACAATGATGT
CTTGGTGAAG
GACTATACTG
ATGCGGCTTA
CAGAGGCAGT
TTGAAGTATA
ATTCCGGAGA
GATAAGGTTA
AATATCCCGG
GAGCTTCAAG
CCTCAGAGAC
120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 GTTTTCCTTT GGAGAAAGGA GTTCCTCCTC CGTGGTGGCC TAATGGGAAA GAGGATTGGT
GGCCTCAACT
TGAAGAAGGC
TTGCTP.AGAT
AAGAGAGTGC
ATCCCGAGTC
ATTGCAGTCA
TCAAGCCAGA
TTCCTGTCAA
ACAGAGATCT
CGCACAGCGA
CATGTCCACA
ATGAAGTTAA
CAATTGACTT
CCATGTACGA
TGTCACTGCT
TGGTGGAAGG
GCAGCAACAA
ACACTGCAGA
ACAGGTTCCA
ATGATATGTC
ATGTATCCAT
TTGTGTTCTT
GCTTGTTATG
TCGGCTTGGT
AACGAACACT
GTGTTTTGGT
CTCAAAGTTA
CTTATCGCTA
TCGAGTGATC
GATCTTAGAT
TGGTTTGCCT
GTGGAAAGTC
CCGTAAGCTC
TACCTGGCTT
ATGTCCACCT
ATACGATGTT
AAAAGTTATG
AGAAGAAGTC
GAACACTATT
AATCAGCCGG
TCGAGACAGT
GCCTGTAGTT
AACGGGTATA
CAGAAATGTC
TCAACCCACA
AAGTTTCTTT
TCAAACGTTT
TCACAACAAC
GCTTGTGTTT
GATGCCAGGA
ATGGTTCTAA
ACATTCACTC
ATGTGTCTGT
GAATCTCTCT
AAATGTAAGT
TGAGTTTGAC
GGGAATGGTG
GTGGCTCGCG
GAGCACACAC
GTAAGGNATT
AAAGATCAAG
GGCGTTTTGA
GTGAGGCAAT
GCTATTATTA
CTTTCTCTGT
GAAGGTTTCG
AATTCTTCAA
CCAGCAGGAA
ATGGACAGAA
GGATTTCTGG
CGCTTACCGT
GGATTTCCTC
GTTCCTGAAG
CAGAGCAACC
GTCCATAACC
GTCCTGCACC
CTGCGGTTAT
CTAAATGTTT
ACCAAGAAGA
CTGGTGGAAG
AGAAGGAGTC
ACTTTGGGAT
ACTCGGAATT
CCGTTTTCAC
ATAGGAATTC
ATGGAGCAGC
AGCCAAGGCC
ATGGACAGAA
AAACCTCTAT
ATCAAGAACA
TTACAAGAAG
CAAGCATATG
GCAGGATAAG
GTCCTTGGCT
TTGCTCGCTT
TCACTATGAA
GGTTGCTAAA
CATGAGAAAG
CTGCGAGAAT
GAGAGACAAC
ACCATCCAGG
AGTGAACTCA
GATGATCTCA
GGTCATGGAA
TCTCCAGTTC
CAGAGCAAAC
CAACAATGTG
CAACAATAAC
GGCGTCATTC
AATGCAGCAG
TCTTCTCTTA
TGGGTCTCTC
TGTTACTGTG
ACACACCCGA
TGNAAGCTCT
GTGTAATTCA
TGTATTTGAT
CAATGCTTGT
CAAGAAGAAA
CCTCATGATT
TTTCCTGATA
ATGACTGCTA
AGAGAGCTTT
CTGATGAATG
GTGGAAGAGC
ATGCATGACT
AGAAAGCCAA
CTTGGGTGTG
CATCAACTGG
TTTCATGTCA
GTAGCCCAAC
GAGCTCATGT
AATCAAAGCG
CCAGGAAACA
AACAACAACA
TTTAAGTTCG
AGTAGCGGCA
GATTACAGAG
AAGCAGCAAG
TTTTTATCTT
TGTCTCTATC
TGTCTTTGTC
CTTGGGGATG
CTTCTTCTGT
CGCTAACTAC
GGATAACGTG
GTCTACGAGC
AACAAAATAA
1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2580 2640 2700 2760 2820 2880 2940 3000 3060 3120 3180 3240 3300 3360 3387 GAAGACTTGA ACATCCCAAA TTTCAAGGGA ACAACAACAA
TTTGAAGCTG
GATTCCACAC
GTAGTAGGAA
AGTCTTGGTA
AACCATGTAA
AAGAGTCTCT
GTCATCATCA
TTTCATAATA
TTTTACAATT
GGATAATTAT
GGTAGCATTT
AATCCAATCC
CTTAAAT
CACATAACAA
CGTTCGACAT
CGATGGATGG
GTAGATTTCA
TATTTTTTCC
AAAAACTCTC
GCTTTTAGTT
TAAATATATT
GAAAAGTTTG
TTATTACAAT
AAGCATGGGT
GAACACAAAA
INFORMATION FOR SEQ ID NO:6: SEQUENCE CHARACTERISTICS: LENGTH: 628 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear W. BSTTUTE SHIEET (RULE WO 95/353 18 PCT/US9s/07744 (ii) MOLECULE TYPE: peptide (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO:6: Met Met Phe Asn Glu Met Gly Met Cys Gly Asn Met Asp Phe Phe Ser 1 5 10 Ser Pro Asp Leu Arg Asp Gin Gly Asp 145 Ile Thr Leu Gly Gin 225 His Lys Gly Asp Glu Lys Gin Gly Gly Ala 130 Arg Pro Leu Met Val 210 Leu Asp His Ser Ser Leu Glu Ser Ile Phe 115 Ser Asn Gly Gin Gln 195 Pro Gly Leu Met Leu Ile Glu Gin Gin Leu 100 Val Asp Gly Ile Glu 180 His Pro Leu Lys Phe 260 Gly SVal Arg Asp Glu Lys Tyr Asn Pro His 165 Leu Cys Pro Pro Lys 245 Pro Glu Glu Arg Lys 70 Gin Tyr Gly Leu Ala 150 Glu Gin Asp Trp Lys 230 Ala Asp SVal Asp Met 55 Gly Ala Met Ile Arg 135 Ala Gly Asp Pro Trp 215 Asp Trp Ile SAsp SAsp 40 Trp SLys SArg Leu Ile 120 Glu Ile Asn Thr Pro 200 Pro Gin Lys Ala Met 280 Phe 25 Tyr Arg Glu Arg Lys 105 Pro Trp Thr Asn Thr 185 Gin Asn Gly Val Lys 265 Thr Cys Thr Asp Gly Lys 90 Met Glu Trp Lys Pro 170 Leu Arg Gly Pro Gly 250 Ile Ala Pro SAsp Lys Val 75 Lys Met Asn Lys Tyr 155 Ile Gly Arg Lys Ala 235 Val Arg Lys Val Asr Met Asp Met Glu Gly Asp 140 Gin Gly Ser Phe 3lu 220 Pro Leu Lys 3lu SPro Glu SArg SAla SSer SVal SLys 125 Lys Ala Pro Leu Pro 205 Asp Tyr Thr Leu Ser 2 285 Gin Ile Leu Ala Arg Cys 110 Pro Val Glu Thr Leu 190 Leu rrp Lys Ala Jal 270 Ala Ala Asp Lys Lys Ala Lys Val Arg Asn Pro 175 Ser Glu Trp Lys Val 255 Arg Thr Glu Val Arg Gin Gin Ala Thr Phe Asn 160 His Ala Lys Pro Pro 240 Ile Gin Trp Ser Lys Cys Leu Gin Asp Lys 275 Leu Ala 290 Ile Ile Asn Gin Glu 295 Glu Ser Leu Ala Glu Leu Tyr Pro SUmBSTIm SHE rT(RuI) -WO 95/35318 *WO 9535318PCTJ/US95/07744 Glu Ser Cys Pro Pro Leu Ser Leu Ser Gly Gly Ser Cys Ser Leu Leu 305 310 315 320 Met Hi-Js Asn Val Asp 385 Gly Arg Tyr Val Asp 465 Leu Val His Phe Asn 545 Lys Asn Pro Gly Asn Tyr Phe Pro 370 Leu Cys Asp Gly Gly 450 Leu Met Met Gin 
Glu 530 Asn Phe Asn Phe Val 610 Tyr Leu Lys Glu 375 Asp Ile Ala Arg Arg 455 Pro Arg Val Phe Pro 535 Gln His Asn Phe Asp 615 Asp Lys Met 360 Phe Arg Ser Cys Phe 440 Pro Glu Asn Ser Pro 520 Asn Gly Asn Arg Asp 600 Gly Val Pro 345 His Met Thr Arg Pro 425 His Val Asp Val Leu 505 Gly Arg Asn Asn Phe 585 Tyr Met Phe Val Pro Arg 380 Thr Leu Asp Glu Val 460 Lys Asn Pro Val Asn 540 Asn Ala Val Asp Lys 620 Lys Asn 350 Lys Pro Glu Arg Arg 430 Lys Gln Ile Thr Val 510 Gly Asn Asn His Asp 590 Ser Gln Ser Ile Trp Phe INFORMATION FOR SEQ ID NO:7: SEQUENCE CHARACTERISTICS: LENGTH: 2234 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
GGCCGCTTCA
CTTCCCATGT
TTCATCAAGA
GTTCTTTATT
TGGAAGAGAC
TTCCTCCACA
AGATGTCGAC
AATGCGTTTG
GAAACAGAGG
GATCTTGAAG
TATTATTCCT
GAAAGATAAG
GAATAATATT
GCTTCAGGAG
TGATCCACCG
TGGGAATGAA
TAAGAAGCCT
GCATATGTCG
GGATAAGATG
TGTGGCTCGG
AAGCGGGTCG
ACAACATGGT
AAGCTTTGGG
AAACTTAGAG
CAGATCAGCA
ATTTCAAGAC
TTTAGCGTAT
AACTCTACAA
GATCTTTAAC
CTTCCTTATC
TGCAACTCTT
ATGATGATGT
TCTCTCGATG
TACACCGATG
AAACGTCTCA
ACCCAGAAAC
AGACTTTTCT
TGTTTCTTTT
TCAGGTTAAA
TTAACGAGAT
TGTGTCCATT
ATGAGATGGA
AGGAGCAACA
CAGTCGCAAG AGCAAGCTAG TATATGTTGA AGATGATGGA
GAGAAGGGTA
GTTAGGTTTG
TCTGGAGGGA
CTTCAGGACA
CAGAGACGGT
GAGTGGTGGC
CATGATTTGA
CCGGATATTG
ACGGCGAAAG
GAGCTTTATC
CTTCTCATTA
TTCGATGTGG
GTTCGCTAAAA
TTCACGAGAA
GGTTACACTT
AGGAGTTCAA
GGAGCATCCA
AGCCTGTGAC
ATCGTAATGG
GTAATGATTG
CGACTCTTGG
TTCCTTTGGA
CTCAGCTTGG
AGAAAGCTTG
CGAAGATCCG
AGAGTGCTAC
CCGAGTCATG
ATGATTGTAG
AAGAGCGGAA
TGCAACATTT
AGAGGAAGCA
GTGAGAATGG
GGGACAACCA
AGTTTCATAT
CACCACACAG
TCTTATTCTC
ATAAAACAAG
GAAATCGATA
GGGAATGTAT
ACCACAAGCT
TGAGCTTGAG
GAGTAAGTGT
GAGGAAGAAA
AGTTTGTAAA
TGGTGCTTCG
TCCAGCTGCT
TAACAGCTTG
TTCGCTTTTA
GAAAGGAGTT
TTTACCAAAT
GAAAGTCGGT
TAAGCTTGTG
TTGGCTTGCC
CCCTCCTCTT
CGAGTATGAC
ACCAGAGATA
TCCCATAAAG
GAACAATGAT
TCAGTGTCCT
ZCAGATGGTT
TAATTAATGT
CATCTCTGAA
AGAGAGATAC
GGCTCTGTTC
GGAAACATGG
GAACAAGAAC
CAGAGGATGT
AAAGGAGGCG
ATGTCTAGAG
GCTCAAGGCT
GATAATTTGA
ATTGCTAAGT
GTTGGTCCAA
TCGGCTTTGA
TCTCCACCTT
GAGCAAGGTC
GTTTTAACTG
AGGCAATCAA
PITTATTAACC
TCTTCTTCTT
GTTGAAGGTT
GTGATGATGC
GAGGAGGTCG
P.TGAATGTTA
CACAGCAAAA
TGTCCATATA
CTCTTTCTTT
GTTGTGGGGA
CACTTTTGGT
TTGATTGTGG
ATTTCTTCTC
CTGTAGTTGA
GGAGAGACAA
TCGATGGTTC
CCCAAGATGG
TTGTTTATGG
GGGAATGGTG
ATCAGTCAGA
CACCGCATAC
TGCAACATTG
GGTGGCCTAA
CTCCTCCTTA
CGGTGATCAA
AATGCTTGCA
AAGAAGAGGT
CATCATTAGG
TCGAGAAGGA
ATCCTCTAGC
CCACCACGGT
TGGTAATGGA
TGAATCTTGG
GAGACAATCG
120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 GGGTGGAATG AAACTAGTAG TTCCTCAGCA
ACCAGTCCAA
CACCGAGCTT
GATGGAAAAC
CAACAGTGGC
CCAGATGGTG
GCAAACCGGA
ATGTATCAAT
CTCTGTTACC
TTGTTGATGA
TCATTGTTTT
TTTGAGGCGG
CCGATCGACC
ATGGCCATGT
CAAAGCATGG
AATCAAATGT
TTTGATTCGA
GCAATGGAAG
ATGGTTCTGA
TACTTACCTG
TGATGAAGCC
AATAA!TGTCA
CCGC
TATCGGGCGT TGGAGTTCCG
ACGACAGAAA
TCATTGATGC
TTATGCAACA
CACCATTCGA
GAATGGGGAA
ATATTACACA
ACTTGGGTAT
ATCTATTTTT
CTATCCATTG
TGTCCAAAGC
AAAAGCAGCT
AGGGACGAAC
TATGGCAGCA
GCAGCAGCAG
ATCTCTGTAA
GTATTCTATT
TTTTTGTGTC
AACATCATTC
GAAAACGGGC
AACCAAACGC
CAGAATCAGC
AACGGGGTTA
TTCGATTACA
CAGCAGCAGC
TATTCATTCT
GCACCAAACA
TGAAAGTCAT
TCATGCTACA
AGAAGATGAT
CTCCTACTTT
AGCTGAATTT
ACAATCGGTT
GAGATGATTG
AGCAGCAAAG
TTCATAATAA
CTCATCTATA
TTAACTCGCT
AGTTTGATTC
1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2234 INFORMATION FOR SEQ ID NO:8: SEQUENCE CHARACTERISTICS: LENGTH: 584 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: Met Met Met Phe Asn Glu Met Gly Met Ty: Gly Asn Met Asp Phe Phe Giu Gin Ser Ser Thr Pro Val Val Ser Leu Asp Val Cys Tyr Leu Pro Gin Glu Glu Asp Val Asm Glu Leu Asp Thr Asp Asp Met Asp Val Giu Lys Arg Trp Arg Asp Lys Arg Leu Lys Arg s0 Leu Lys Glu Gin Gin Cys Lys Giu Arg Gin Ser Glu Gin Ala Arg Val Asp Gly Ser Lys Lys Met Ser Arg Ala Met Glu Val Cys Lys 110 Lys Gly Lys Pro Val Gin Asp Gly Ala Gin Gly 115 Thr Gly Ala 130 Lys Tyr Met Leu 105 Ile Lys Met Pro Glu Val Tyr Gly Ser Asp Asn Arg Glu Trp Trp Asp Lys Val Arg SON=fW[ M..E (oilE3 WO 95/35318 PCT/tUS95/07714 Phe Asp Arg Asn Gly Pro Ala Ala Ile Ala 145 150 Asn Pro Ser Glu Trp 225 Lys Val Arg Thr Tyr 305 Gly Glu Val Phe Arg 385 Ser Asn Cys Met Asp 465 Glu Pro Ile His Ala Lys 210 Pro Pro Ile Gin Trp 290 Pro Ser Lys Met Pro 370 Lys Ala Leu Pro Gly 450 Leu Leu Thr Ser Thr Leu 195 Gly Gin His Lys Ser 275 Leu Glu Leu Glu Met 355 Ile Arg Gly Gly Tyr 435 Gly Ser Met Leu Gly Leu 180 Met Val Leu Asp His 260 Lys Ala Ser Leu Gin 340 His Lys Lys Tyr Phe 420 Arg Met Gly Ala Met Gly 165 Gin Gin Ser Gly Leu 245 Met Cys Ile Cys Ile 325 His Pro Glu Gin Thr’ 405 Gin Asp Lys Val Met 485 Glu Ser Glu His Pro Leu 230 Lys Ser Leu Ile Pro 310 Asn Gly Leu Glu Asn 390 Cys Asp Asn Leu Gly 470 Tyr Asn Asn Asp Leu Gin Cys Asp 200 Pro Trp 215 Pro Asn Lys Ala Pro Asp Gin Asp 280 Asn Gin 295 Pro Leu Asp Cys Phe Asp Ala Ser 360 Val Ala 375 Asn Asp Glu Asn Arg Ser Arg Leu 440 Val Val 455 Val Pro Asp Acg Gin Ser Cys Asp 185 Pro Trp Glu Trp Ile 265 Lys Glu Ser Ser Val 345 Phe Thr Met Gly Ser 425 Ala Pro Glu Asn Met Asn 170 Thr Pro Pro Gin Lys 250 Ala Met Glu Ser Glu 330 Glu Gly Thr Asn Gin 410 Arg Tyr Gin Asn Val 490 Val Lys 155 Ser Thr Gin Asn Gly 235 Val 
Lys Thr Val Ser 315 Tyr Glu Val Val Val 395 Cys Asp Gly Gln Gly 475 Gln Ile Leu Leu Arg Gly 220 Pro Gly Ile Ala Val 300 Ser Asp Arg Ala Asn 380 Met Pro Asn Ala Pro 460 Gln Ser Asp Val Gly Arg 205 Asn Pro Val Arg Lys 285 Ala Ser Val Lys Lys 365 Leu Val His His Ser 445 Val Lys Asn Ala Gly Ser 190 Phe Glu Pro Leu Lys 270 Glu Arg Leu Glu Pro 350 Met Glu Met Ser Gln 430 Lys Gln Met Gln Lys Tyr Gln Ser Glu Asn 160 Thr Leu Leu Trp Lys 240 Ala Val Ala Leu Ser 320 Phe Ile His Thr Arg 400 Met Val His Ile Thr 480 Pro Ala Gln Asn Gln Gln Leu Asn Phe Asn Ser Gly Asn Gln Met Phe Met Gln 515 520 525 Gln Gly Thr Asn Asn Gly Val Asn Asn Arg Phe Gln Met Val Phe Asp 530 535 540 Ser Thr Pro Phe Asp Met Ala Ala Phe Asp Tyr Arg Asp Asp Trp Gln 545 550 555 560 Thr Gly Ala Met Glu Gly Met Gly Lys Gln Gln Gln Gln Gln Gln Gln 565 570 575 Gln Gln Asp Val Ser Ile Trp Phe 580 INFORMATION FOR SEQ ID NO:9: SEQUENCE CHARACTERISTICS: LENGTH: 1722 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
CAGATTCTAT
CGGCGCCTCC
ATCTGAGTAG
AGCAGCGTTT
TGAAGCAGCA
AAGATGGGAT
TTTATGGGAT
AATGGTGGAA
AAAGGGATAT
CACAGAAGTT
CTCACTGCAA
GGCCAACGGG
TTCCGCCACC
TTGGTGTAAT
CTAGAAGTTT
ACCGAGAAAA
GGATATGTAT
ATTTACAGAG
TGATGAGGAA
AAAGCGGCTC
ACATGATGAT
CTTGAAGTAC
TGTGTTAGAG
AGACAAAGTG
CAATCTTTCT
GCTTGAGCTT
CCCTCCTCAG
GAAAGAAGAT
TTACAAGAAG
CAGACATATG
GCAGGAGAAA
GGCTATTGTT
AACAACAATA
GGACATATGT
ATGGAAATAG
AAGGAAATGG
TTTCCAGAGC
ATGTCGAAGA
AATGGGAAAA
AGGTTTGATA
GATGGAAGTG
CAAGATACTA
AGGCGGTTTC
TGGTGGGATC
CCTCATGATC
GCTTCTGACA
ATGACGTCAA
GATCAAATAG
TAGGGATGTT
GTTCTGATTC
J43GAGCTTGA
CGAAGAACGG
ACTCTAGTAA
CAATGGAGCG
CGGTAGCGGG
GGAACGGCCC
ATTCAGGGTC
CTCTTGGAGC
CGTTGGAGAA
AACTGTCTTT
TCAAGAAGCT
TTAGCAACAT
GAGAAGGCGC
CCATGTCTAG
CCGGAGTTTA
GCATACGGCT
GAAGAAGATC
TCTAGGAACA
GAGAACCATG
ATATAAAGCT
ATCTTCTGAT
AGCTGCTATA
TGAGGTTGGG
TCTGTTATCG
AGGCGTGACA
ACCCGTTGAT
GTGGAAAATT
ACCCAATCTC
TTTATGGCTC
AGAAAACAAC
GTTTGTAGCT
TTGTGCGATG
TGGAGAGACA
AGATTGTTGT
TACAAGGCAC
C4AGGTTTTG
AATCTCCGTG
ATCAAGCACC
GATTCTACCG
GCTCTGTTTC
CCGCCATGGT
TTTCGAGGTG
GGTGTTTTGA
GTGAGACGGT
GCTGCTCTTT
AACACTTCTA
ACTTTCTTGT T%-CTGCAACC GGTGGAGACC CAGATGTTTT GTTTCCTGAA TCTACAGACT
ATGATGTTGA
ACAACTACAA
CAACACTCCT
TTGACAGGAA
ACCAACCAAC
GGATGCAGCA
GACCAAAAGC
CGACGCTGAA
CAGTAGGAAC
AGTAAAGAAA
GCTTATTCTC
GAGAAGATTA
ACTGATTGGT
CTGTGTTTAC
AACATGTGAG
CTTAAGAGAG
TAAACCCTAT
GCAGGTTCAG
TCCACAAAGA
TCAGAATCTT
AGAGAACAAT
GCTTCAGAGT
TCATTAAACA
GGTTTCATAA
GGCACTCATC
AAGAGAAAGT
AACAGTCTCT
AATCACCAAA
GGTATGACGG
AGCTTTCAAG
GGCAACGATG
GGTTTAGTCT
CTGCATAATC
TTTCTTTTTA
CAGTTTTTGA
GGACCAATCA
TTGAAGAAGA
GTCCTTATAG
TGACTTGTCC
GTTTAATGGT
ACCAGTTTAA
ACTTGGTTGA
TACCTACTGA
AAGGGCAAGA
TGTTTTCTAG
TCTCTCCATT
GCAGTATCCT
TTTTGGGATG
CCAACCACAT
TTATAAAGTC
TCCTTGTCCG
TCATCCCAAC
GGATTTGAAT
CTTCAATGGA
GTTGCCCACA
TCTTTATAGC
TCATAGCCCA
GAATTTGAAA
CCAATGCATC
ATGGGATTTC
ACTTCCTTCT
GATTATAACG
GATCTCTACA
CCTTCTCCTT
GGTGAGGAAA
TCTTGGATTC
TTTGTCTCTT
TGTAGCAATG
1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1722 T.AAGTTAATA ACCAAATTCA AA INFORMATION FOR SEQ ID NO:l0: SEQUENCE CHARACTERISTICS: LENGTH: 520 amino acids TYPE: amino acid STRA2NDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID Asp Ser Met Asp Met 1 5 Val Cys Ser Ser Ala Ser His Thr Ala Leu Ile Giu Glu Leu Giu Arg Leu Lys Gin Met Tyr Asn Asn Asn Ile Gly Met Phe Arg Ser Leu Pro Pro Phe Cys Asp Asp 40 Lys Lys Ile 55 Thr Gin 25 Len Ser Trp Arg Gly His Met Ser Asp Giu Asp Lys Gin Cys Ser Asp Giu Met Gin Arg Leu Lys Leu Leu Len Arg Thr Met 66 Ala 70 Lys Asn Gly Leu Lys Thr Arg Ser Lys Gin Gin His Asp Asp Phe Pro Gly Ile Leu Tyr Lys Ala Gin Asp 100 Giu His Lys Tyr 105 Tyr Gly Met Ser Lys Thr Met Gin 110 Ile Val Leu Giu Asn Gly 125 Arg Tyr Lys Ala Gin Gly Phe Val 115 120 SUMSIftSHEET (RWE ~t ‘WO 9535318 WPCTUS95I)7744 Lys Thr Val Ala Gly Ser Ser Asp Asn Leu Arg 130 135 Giu Trp Trp Lys Asp 140 Lys 145 Arg Asp Ala Phe Glu 225 Pro Gly Ile Ser Ile 305 Phe Ser Gin Lys Cys 385 Asp Thr Val Gin 2 Gin 1 465 Thr I Val Asp Ser Leu Pro 210 Asp Pro Val Pro Arg 290 Val Leu Thr 3Gi Phe 370 lu krg 3er ?ro ksp 150 rg ~eu Arg Phe Asp Ile Asn Leu 165 Thr Ala Gin 180 Leu Ser Ala 195 Leu Giu Lys Trv Trp Asp Pro Tyr Lys 245 Leu Ile Gly 260 Asn Leu Val 275 Glu Gly Ala Asp Gin Ile Val Pro Ala 325 Asp Tyr Asp 340 Tyr Pro Glu 355 Glu Glu Asp Asn Ser Leu Asn Leu Arg 405 Phe Tyr Gin 420 Cys Pro Asp 435 Gin Phe Asn Gly Asn Asp Asn Gln Asn Arg Asn Giy Pro 150 Ser Asp Gly Set Lys Leu Leu Leu Phe Pro 200 Gly Vai Thr 215 Gin Leu Ser 230 Lys Pro His Val Ile Arg Arg Arg Ser 280 Leu Trp Leu 295 Ala Met Ser 310 Thr Gly Gly Val Giu Leu Phe Giu Asn 360 Phe Gly Met 375 Cys Pro Tyr 390 Glu Asn His Pro Thr Lys Tyr Asn Giy I 440 His Pro Asn 2 455 Asp Leu Vai 470 Leu Gly Leu Glu 185 His Pro Leu Asp His 265 Arg Ala Arg Asp Ile 345 Asn Pro Ser 3Gm Pro 425 let ksp ilu lal Ala Asp 170 Leu Cys Pro Pro Leu 250 Met Ser 
Ala Glu Pro 330 Gly Tyr Met Gin Met 410 Tyr Gin Leu Asp I Leu I Ala 155 Ser Gin Asn Trp Vai 235 Lvs Ala Leu Leu Asn 315 Asp Gly Asn His Pro 395 rhr ly .in Cyr ~eu 175 !ro Ile Gly Asp Pro Trp 220 Asp Lys Ser Gin Tyr 300 Asn Val Thr Cys Pro 380 His Cys Met Gin Arg 460 Asn Thr lie Sex Thi Prc 205 Pro Phe Leu Asp Glu 285 Arg Asn Leu His Val 365 Thr Met Pro Thr Val 445 Pro Pro Asp Lys His Glu Val 175 Thr Leu 190 Gin Arg Thr Gly Arg Gly Trp lys 255 Ile Ser 270 Lys Met Glu Lys Thr Ser Phe Pro 335 Arg Thr 350 Tyr Lys Leu Leu Gly Phe Tyr Lys 415 Gly Leu I 430 Gin Ser Lys Ala Ser Pro Phe Asn Gin 160 Gly Gly Arg Lys Vai 240 Ile Asn Thr Ala Asn 320 Glu Asn Arg Thr Leu 400 Val Met Phe Pro Ser 180 3 ly sui NHEEF (RE’ 0V( 95/35318 C’U9/74 67 485 490 495 Gly Giu Glu Thr Val Gly Thr Glu Asn P-9n, Leu His Asn Gin Gly Gin 500 505 510 Glu Leu Pro Thr Ser Trp Ile Gin 515 520 INFORMATION FOR SEQ ID NO:ii: SEQUENCE CHARACTERISTICS: LENGTH: 2065 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO:ii:
TTCCCCTGAG
CAGATTATCG
CCGTAGCAGA
AGATTGATGT
ATCGTGTCAG
ACGAAGGGAG
CTCAAGATGG
TTTGTCTATG
AAGAGCTTGG
ATACGAAGAG
TCTCCAGGAT
TGATCCTCCT
GGGGAATG A
AA,\ACCTCAT
TATGTTACCT
CAAGATGACA
GATTCAGCAG
TAATAACGCT
GACAGAGGAA
AGAACAACCA
AACGACAGGA
TTGTTAAAGG
CATCAGGATG
GAGTGATGAA
GCTTAAAAGA
ACACCTAAGA
TATCCTTAAG
GTATAATACC
TGGAAAGAGA
GAGTGTTTAG
TTGCAAGATG
CAAAGGAAGT
GAATGGTGGG
GATCTCAAGA
GATATTGCAA
GCTAAAGAGA
GAAAGAATAA
TTTTTGATTG
GAGAATGAGC
GAGATTGATG
ATCAAAGAGC
AAATCTCTGA
TACATTGTTG
GGAAAAGGGC
AAGTGAAGTT
CGTTTGGGAA
CTACTTTAGG
ATCCGTTGGA
TGAAACTCGG
AGATGTGGAA
AGATTAAGAG
GTGCGATTTG
AAACCCTAAA
ATTTTGTTTA
CTGATGATTT
CTGACGACCT
GACAAAAAGC
TCAAGCTCAG
AAGCTTATGG
AAGCCTGTGA
TGATAAkGAAC
ATCTGATGGG
GTCTTTGTTA
GAAAGGGACG
TCTGCCTAAA
GGTTGGAGTT
GCATGTTCGT
GTTGGCGGTT
CTCCAATGTG
CAACAGTGAC
TAAAGACAGT
AAGAGATCAA
TTTCTTTAAT
AATGGGCGAT
AGCTAGTGAT
TGAGAGACGG
TGGCTCTCAA
AGGAAGAA-\
AAGTCTGCAA
GTTCGCTCCT
GGTCC, GCTG AtATAGGAATT
TCTTCTTTGA
CCTCCGCCTT
AGCCAGAGTC
TTAACGGCAG
CAGTCGAAAT
TTGAACCAAG
ACTGAGACAC
AGTGACTATG
AGAAGAAATC
GATAAAGCAG
TTCGGCGCTT
CTTGCTATGT
AATGTTGCTG
ATGTGGAAAG
GGAGCTCAAA
TGTCTTAGAG
AGTTCGCGGG
CTGACAA’.A»T
CTATTGCTAA
CACAGTTTGT
TGCAACATTG
GGTGGCCAAC
CTCCTTACCG
TGATCAATCA
GTTTACAGGA
AGGAATCTTT
ATCGTAGGGG
ATGTTGATGG
AGATTCAAAA
AGAAACATCG
120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 CCTAGCAGTG ACAATGGAAA GACAGGAGGA AACCTGTGGT GCTTCAGGTT CAGTTTCATC ACAGCCAMCT CACATTCAGT
CAGAAGGAAA
TGAAGCTCAA
ATATAACATC
ACCAGAGGAT
TTCCACTTGT
CCAAACCCTA
GTGTTTGTAC
GCACCACCTC
TTACTCGGAA
CTGTCTCCAT
TTTGGAGCTT
AATAGTCTGG
TCTTTTTTTC
AGTTGCCAGA
CATCATTTGT
AGACCTCGAA
CAAAGAAACA
AACGGTACTC
AATGGTCTGG
TAATGAACAA
ACCAAGAGCT
ATAACCAGGA
ACAACAGTGG
ATGAAGACGG
TGACTGACTT
AGTGTCTTGC
CTAGTACAGT
TTATAATATT
AGAACTGCTT
AAATTTACCT
TTAGATCCGG
TCTTACCTGA
ATCAAGAGGA
AACTAGTGGT
ACTATGATGC
TCAATTTGGG
AGACGACATT
GTTCGAGGAG
TGTAACAGGG
GGACTTTGAC
CATTTTTTTT
TACTTTCTCT
TTCTTAGATT
TGCCCGTTGT
GTGAG
AACTGTCAAT
TATGAATCAT
CGATGTTGTC
T.-CTGAGTTC
CTGTAGACGA
TCAGGGTACA
CTCCATACAC
GCCCCAGGAG
AGTGAGTTGC
TATGGTGGTT
GGGAGATTAC
TCTTCATTTC
TGTTAAGAGA
AATGGTCTCT
CGACAAGAGG
GTTGATGCCC
GACCCAAATA
AATAACCAAA
AAGGCCAATG
ACTTCTACAA
AGATAGAAAT
GAGTACTTCA
CTCAGTATCA
TTGGTGATGA
ATAGTTCAAA
TTCTGATCTT
AACAATTTTC
AGGGAAAGCA
AAGAACAACC
CTCTGCTAGA
TTGCCTTAGG
CATACTTATC
CTTTATGGAC
TCCCTCTGCA
GAATACACAA
ACCCCTTGGT
GAGTGGCATT
TTTCTCATGG
AGGACATGGC
ATATTCTTCC
CTTTTGAATA
GTTAGCGTAT
1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2065 INFORMATION FOR SEQ ID NO:12: SEQUENCE CHARACTERISTICS: LENGTH: 567 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO:12: Met 1 Gly Asp Leu Ala’Met Ser Val Ala Ile Arg Met Glu Asn Glu is Pro Asp Asp Glu Glu Ile Val Arg Leu Leu Ala Ser Asp Asn Ala Glu Ile Asp Asp Ala Asp Asp Glu Arg Arg Met Trp Gly Val Ser Asp Lys Asp Arg Ser Gln Gly Lys Arg Ile Glu Arg Gln Lys Ala Gln Thr Ly8 Glu Thr Pro Lys Lys Ile Asp Gln Ala Gln Lys Lys Met Ser Ala Gln Asp Gly Leu Lys Tyr Met Leu Lys Ile Pro Leu Met Glu Val Cys Lys Val Arg Gly Phe Val. Tyr Gly Ile vSMEE[ wrN) WO 95/35318 PCTIUS95/07744 Glu Lys Trp Lys 130 Lys Tyr 145 Asn Ser Leu Leu Pro Leu Glu Trp 210 Arg Lys 225 Ala Val Val Arg Ala Ile Pro Ser 290 Gly Asn 305 Tyr Asp Asp Ser His Ser Arg Pro 370 Pro Glu 385 Ala Pro Val Val Leu Val Asn Glu 450 Gly 115 Glu Glu Gin Ser Glu 195 Trp Pro Ile Gin Trp 275 Ser Asn Val Arg Val 355 Arg Ala Leu Asp Val 435 Gin Lys Lys Glu Phe Ser 180 Lys Val Hit Asn Ser 260 Leu Asp Ala Asp Arg 340 Arg Iik Gin Leu Pro 420 Pro Thr Pro Val Glu Val 165 Leu Gly Lys Asp His 245 Lys Ala Asn Asp Gly 325 Asn Asp Arg Gin Glu 405 Asn Glu Met Val Lys Cys 150 Leu Met Thr Leu Leu 230 Met Cys Val Gly Arg 310 Thr Gin Gin Ser Arg 390 Tyr Ile ;he Met Ser Phe 135 Leu Gin Gin Pro Gly 215 Lys Leu Leu Leu Asn 295 Arg Glu Ile Asp Gly 375 Asn Asn Ala Asn Pro 455 Gly 120 Asp Ala Asp His Pro 200 Leu Lys Pro Gin Asn 280 Ser Lys Glu Gin Lys 360 Thr Ile Ile Leu Asn 440 Val Ser Lys Phe Leu Cys 185 Pro Pro Met Asp Asp 265 Gin Asn Pro Ala Lys 345 Ala Val Leu Asn Gly 425 Asn Asp Ser Asn Gly Gin 170 Asp Trp Lys Trp Ile 250 Lys Glu Val Val Ser 330 Glu Glu Asn Pro Gly 410 Pro Tyr Glu Asp Gly Lys 155 Asp Pro Trp Ser Lys 235 Ala Met Glu Thr Val 315 Gly Gin Lys Arg Asp 395 Thr Glu Thr Arg Asn Pro 140 Ser Al-, Pro Pro Gin 220 Val Lys Thr Ser Glu 300 
Asn Ser Pro His Gln 380 Met His Asp Tyr Pro 460 Ile 125 Ala Asp Thr Gln Thr 205 Ser Gly Ile Ala Leu 285 Thr Ser Val Thr Arg 365 Glu Asn Gln Asn Leu 445 Met Arg Ala Gly Leu Arg 190 Gly Pro Val Lys Lys 270 Ile His Asp Ser Ala 350 Arg Glu His Glu Gly 430 Pro Leu Ala Ile Asn Gly 175 Lys Asn Pro Leu Arg 255 Glu Gln Arg Ser Ser 335 Ile Arg Glu Val Asp 415 Leu Leu Tyr Trp Ala Arg 160 Ser Tyr Glu Tyr Thr 240 His Ser Gln Arg Asp 320 Lys Ser Lys Gln Asp 400 Asp Glu Val Gly Pro Asn Pro Asn Gln Glu Leu Gln Phe Gly Ser Gly Tyr Asn Phe Tyr 465 470 475 480 Asn Pro Ser Ala Val Phe Val His Asn Gln Glu Asp Asp Ile Leu His 485 490 495 Thr Gln Ile Glu Met Asn Thr Gln Ala Pro Pro His Asn Ser Gly Phe 500 505 510 Glu Glu Ala Pro Gly Gly Val Leu Gln Pro Leu Gly Leu Leu Gly Asn 515 520 525 Glu Asp Gly Val Thr Gly Ser Glu Leu Pro Gln Tyr Gln Ser Gly Ile 530 535 540 Leu Ser Pro Leu Thr Asp Leu Asp Phe Asp Tyr Gly Gly Phe Gly Asp 545 550 555 560 Asp Phe Ser Trp Phe Gly Ala 565 INFORMATION FOR SEQ ID NO:13: SEQUENCE CHARACTERISTICS: LENGTH: 240 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO:13: Met Thr Val Val Arg Glu Tyr Asp Pro Thr Arg Asp Leu Val Gly Val 1 5 10 Glu Asp Val Glu Arg Arg Cys Glu Val Gly Pro Ser Gly Lys Leu Ser 25 Leu Phe Thr Asp Leu Leu Gly Asp Pro Ile Cys Arg Ile Arg His Ser 40 Pro Ser Tyr Leu Met Leu Val Ala Glu Met Gly Thr Glu Xaa Xaa Xaa 55 Lys Lys Glu Ile Val Gly Met Ile Arg Gly Cys Ile Lys Thr Val Thr 70 75 Cys Gly Gln Lys Leu Asp Leu Asn His Lys Xaa Xaa Xaa Ser Gln Asn 90 Asp Val Val Xaa Xaa Lys Pro Leu Tyr Thr Lys Leu Xaa Xaa Xaa Xaa 100 105 110 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Ala Tyr Val Leu Gly Leu Arg Val 115 120 125 Ser Pro Phe His Arg Arg Gln Gly Ile Gly Phe Lys Leu Val Lys Met 130 135 140 Met Glu Glu 145 Ala Thr Glu Phe Thr Gly Val Asn Pro
195 Val Ile Lys 210 Arg Ile Arg 225 Trp Phe Arg 150 Gln Xaa Asn Gly Ala 155 Asn Lys 180 Val Asp Xaa Xaa Xaa Xaa Asn 165 170 Cys Gly Tyr Ser Glu Phe 185 Tyr Ala His Arg Val Asn 200 Gln Arg Val Glu Tyr Ser Tyr Ile 160 Ala Ser Val Asn Leu 175 Thr Pro Ser Ile Leu 190 Ser Arg Arg Val Thr Leu Glu Pro Val Asp Ala Glu 215 205 Thr Xaa Xaa 220 Xaa Leu Tyr Phe Ser Thr Thr Glu Phe Phe Xaa 230 235 Xaa Xaa Xaa Xaa INFORMATION FOR SEQ ID NO:14: SEQUENCE CHARACTERISTICS: LENGTH: 1702 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
CTCCAACTTT
GTCTATTTCT
ACTTATAC ‘T
AACAAATCTT
TATCTACTTA
CACTCTGAAA
TCGGCGTGGA
TCACCGACCT
TGGTGGCTGA
AAACCGTTAC
AGCCTCTTTA
GACAAGGGAT
CTGAGTATTC
GGAAATGTGG
ATCGAGTTAA
TAAAACTCAT
CTTTCCTTTA
CACGTTATAC
CCTCACTTCT
ACTCTTCTTC
GAACCAAAAC
GGACGTGGAA
TTTGGGTGAC
GATGGGTACG
ATGTGGCCAA
CACTAAACTC
TGGGTTTAAG
GTATATTGCA
TTATTCGGAG
TGTTTCGCGG
CATAAATAGT
AAATCCAAAT
ATATATATAG
CTCATTTCCA
TAACTCTAAT
ATGACGGTGG
CGACGGTGTG
CCGATTTGTA
GAGAAGAAGG
AAACTCGATT
GCTTACGTCT
CTCGTGAAGA
ACTGAGAACG
TTTCGTACAC
CGAGTCACGG
AAAAAAGTAG
CCTATAAACT
AGTTTCTATA
CACTCACCTT
CTCTCTCTCT
TTAGAGAGTA
AAGTCGGACC
GAATCCGACA
AGATAGTGGG
TAAATCACAA
TGGGCCTTCG
TGATGGAGGA
ATAATCAAGC
CGTCGATTTT
TTATCAAGTT
CCGGAAAAAT
CATAGCTTTC
AATGCTTCTC
CCTCTCTATA
ATTTACTCTG
CGACCCGACC
AAGCGGCAAG
TTCACCTTCC
CATGATTAGA
ATCTCAAAAC
CGTCTCTCCT
ATGGTTTAGA
TTCTGTGAAT
GGTTAACCCG
AGAGCCGGTT
AAAATAAAAA
TCTGTTCTTT
TTTCCTCTCG
TATTAAACCC
CTTCTGTTCT
CGAGACTTAG
CTTTCTCTTT
TATCTCATGC
GGATGTATCA
GATGTCGTTA
TTTCACAGGA
CAAAACGGAG
TTGTTCACCG
GTTTACGCTC
GATGCTGAGA
CGTTGTACCG
TTAATAACAA
CCGGGTCTGG
CCGTATTAAG
GATTGAGACG
AACTACCTTC
GAGGAGAAGG
TGGCTAAGGC
GGCGAGGAAT
GGCTTGGAGA
TTTCCATTTT
TTCTCTATTA
GTTTTAGAAT
TTTTTGTGGG
AATCCGGTTT AGCACAACAG ACTCTCGCTT GGGACTTTCG ATCATGGCCC GGTTCGGCTA CGTGTGGAAT TGTAAAGACT TGTGGTGGCT AAAACGACGC GATACCGTCC GTTTTCGAAC TCCACGCGCG GTGAAGATGG AGGTGGTTGT GGTGTCGTGG ACCACATTGG AAAGTGCTAT TGACTATAGT GATGGTGTTG TGTAGACCCT AGAGAATTTT ACCACTTGAT GTTAAATTAG TAATCTTTTT TTTAGGTAAC TGTTATAAAT TA AG3TTTTTCCC
TCGCGGTGCC
AATTCCTCGA
CGTTTCTGTT
GAGTAGTTGA
CTTTTGGACT
TGAAATCCTT
CGGCGGAAGT
CGTGTGACGA
TTGGTGATTG
AAAACTTTTT
GGGTTTTCTT
TTTTTTTGCT
GCGGGATATT
ACGTGGAAGC
ATATCCACCC
AGAAGTACGT
TAAAACGTTG
TCATTTTATG
GTGTGCTCAC
TGCCGGAGAA
GGATCTTTGG
GACTAAATCG
TTTTAACTCT
CTAAGTTTAT
TT’TTGTTTTG
GATTCGGTAC
TGTTATGGAT
GAGTCATGGG
GGAGCGTCGA
CCGTTTCTGA
TATGGAATCG
GCGCATAACT
GACCCGTTGC
TGTATAAAGC
CCACCTGGCG
ATAATATATA
AGATTTTCTT
TTTTGTTTTG
960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1702 INFORMATION FOR SEQ ID NO:15: SEQUENCE CHARACTERISTICS: LENGTH: 4146 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:
TGTCATAATC
GGGGCCACGT
TGATGGTGTT
TATGAAAATT
TTATTGTTTC
AATTCGAATT
GTTGATATTT
CACCTAATGA
CATATAATCA
GGAATACGAT
TTCTTTTTAG
AGTACAAAAT
CAAGTGTGCC
TCTTGCTTAG
GTTCTTCGAG
CAAATTACAC
TCGTCTACAA
CGGCTTTTGA
AGTGTCATGT
TTTGTCATAT
AGACATGACC
CCTCGTAGTG
AAATCACCTA
GTTTATTTTT
TTTCCACTTA
AAGAATTCTG
CAGTAAACAA
TAAAACAATT
GTTTAATTAA
ATATGTATAT
ATATCATCAT
TTTAGGAATT
AATTTGAACA
CCAACCTGAA
GTGTTTATGA
ATACACAATC
ACCCTAAAAG
GGGTTTTTTT
TTCTTACTAA
CTAATTGGTG
ATGTATATAC
GTATTGCATG
TGTTTTTTTC
TTGCAGTTAT
CTATATGTTA
TTGTTTAATA
AAATATCAAG
GTCATTTGAG
TTGTCAACAA
AACAAAACAA
ATTATGTTGA
TTATGTATAT
ACTAAACTAC
TTCTAAATGG
TTCTAGTAAG
TATATTTTGA
TTTGTGCGTG
TGGAACTATT
GGCTTGAGGC
AGATTATTGT
TTAGCTGACG
TGATCTTTCA
ATAAAACGTA
CCTTAAAAGA
ATTCCTTCGC
ATATTTTTTC
TGTATTTTTC GGAAAATGTT AAAAACTAAT TATACACAAT TTACTTTCTC
TATTTTACGT
ACTAGTAATT
TTACTTGGGA
TATGGGCAAA
AAAACGAAA
CAACTTCTCA
TCTTAATCAC
TACTAGGAGT
TCATATTTAT
TTAACATTAT
CATACAAATT
GTGTAGTTTG
AGCAAAAACA
ACTATTTTTC
GTTAGAATTC
TAAAATGGGA
AAGCTCTATC
AAAAAAGTAG
CCTATAAACT
AGTTTCTATA
CACTCACCTT
CTCTCTCTCT
TTAGAGAGTA
AAGTCGGACC
GAATCCGACA
TTCTTTTACT
ACATAGAACA
TTAAAGTTCG
AAAGACTCAT
TAAAGTTTCC
GGGACTGTAA
AGGATGTATC
CGATGTCGTT
TACTGTTTTT
TGGTAGATAG
GATGGTGCAA
GGAAGGGAAC
GAGAGATAGA
CTCCCACTTA
CATTAATCTC
ATTTATTCGT
TGTATTAAAG
TTTACAGAAC
TCTTTTTTTC
TAGTAAACTA
TTTTTGTTCC
CTCTTAACAC
TP.CCTTATCA
ACTGTAGAGA
AATCATGAAC
CCGGAAAAAT
CATAGCTTTC
AATGCTTCTC
CCTCTCTATA
ATTTACTCTG
CGACCCGACC
AAGCGGCAAG
TTCACCTTCC
TGTATGTCTC
CCAACTTCTC
GTTCTTTTTT
CATTTATATT
TCTTTATTAT
CAGGTGGCTG
AAAACCGTTA
AAGCCTCTTT
TTTTTCCTCT
ACAGTTAATG
TGCATCAGAG
TAGTAGTAGA
AACCATAATA
CAACTTCTCC
TACTCATCAT
GAAAAACATT
AGAATTTACA
ATAAAATTTT
AACAAAATCT
AAAAGTGTGG
AATTTCCAAG
ATAAAAGTAA
TTTAGAACTA
CTAGAGACTA
TACTCGCCTT
AAAATAAAAA
TCTGTTCTTT
TTTCCTCTCG
TATTAAACCC
CTTCTGTTCT
CGAGACTTAG
CTTTCTCTTT
TATCTCATGC
TTCAAAAACT
AACCTTTGGT
TGGTATCATT
ATTTTTTGCA
AAAAGGTTAA
AGATGGGTAC
CATGTGGCCA
ACACTAAACT
TGCAAAATTA
TAGTATATAG
TGATGATGTG
AAGGGAAATA
ATGAGTTAAC
TTCTGGGCAA
TAATACGTTG
TAAATGTCCC
TTAGCTGTCA
GAAAATAGAT
ATTTATATTT
ACCAACACAA
CAGCAAATAT
AAAAAGCATT
GCTAATATTT
TAAATAGAGG
CTCCAACTTT
GTCTATTTCT
ACTTATACCT
AACAAATCTT
TATCTACTTA
CACTCTGAAA
TCGGCGTGGA
TCACCGACCT
TGGTAATAAC
CTGTTTGTTT
TAATCCAAAA
TCTATTTTTT
ACCAAATGAT
AAACATATAA
GGAGAAGAAG
AAAACTCGAT
CGCTTACGTC
GAGCTGATGT
ATGGGGTTGA
GAATTTAATA
AATACAGTAC
GCAGACATAG
GTTTTCCACA
AAGCCCACTA
TAATTATAAG
AGCGCCACTG
CTTCTTTTTT
TTTAAATCAT
GGAAGGAATA
CAATGATCAG
AAATTCATAT
ATTGAGAAGA
TAAAACTCAT
CTTTCCTTTA
CACGTTATAC
CCTCACTTCT
ACTCTTCTTC
GAACCAAAAC
GGACGTGGAA
TTTGGGTGAC
ATGTTTCACA
TTTGAACCTA
AACCCATTTT
TCCGATTCTT
ACCCGAGTAA
TAACGGAAA
GAGATAGTGG
TTAAATCACA
TTGGGCCTTC
TCTCAACTCT
ATTTACATTT
GGGCAAATGA
AGTGTGAATT
AAGTAAGAGG
CCGCCATTTT
TCAATGCTCG
TTTCAAAATT
AGATTTAATT
AAAAAGAGAA
CATGTAAGAA
TGAACATTAT
TCGATTTTGT
TAAATTCTTT
TTAAAATCTG
ATACAAAAAA
AGAACTTTTA
CATAAATAGT
AAATCCAAAT
ATATATATAG
CTCATTTCCA
TAACTCTAAT
ATGACGGTGG
CGACGGTGTG
CCGATTTGTA
ATCTTTTATC
GAAGTAGAAA
CCATAAACAA
GATAAGATCA
CTATAACTAA
TTTAAATTAT
GCATGATTAG
AATCTCAAAA
GCGTCTCTCC
720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2580 2640 2700
TTTTCACAGG
GTTATCTAAA
ATGATGGAGG
GATAATCAAG
CCGTCGATTT
GTTATCAAGT
GAGTTTTTCC
GTCGCGGTGC
AAATTCCTCG
TCGTTTCTGT
CGAGTAGTTG
CCTTTTGGAC
GTGAAATCCT
GCGGCGGAAG
TCGTGTGACG
GTTGGTGATT
TAAAACTTTT
GGGGTTTTCT
CTTTTTTTGC
GAGGTAATAT
TTAAGATACT
GATAAAATTT
TTGGAGATAA
TCGTGTGTGC
AATATT
TACCCTTCCG
CTAGTTTTG
AATGGTTTAG
CTTCTGTGAA
TGGTTAACCC
TAGAGCCGGT
CGCGGGATAT
CACGTGGAAG
AATATCCACC
TAGAAGTACG
ATAAAACGTT
TTCATTTTAT
TGTGTGCTCA
TTGCCGGAGA
AGGATCTTTG
GGACTAAATC
TTTTTAACTC
TCTAAGTTTA
TTTTTGTTTT
CTCCTACTTT
TTTTCTTTGT
CATGAACGCA
AATTACAATA
CATCAAGTAT
TTTTCCTCCC
TTTTTGCAGG
ACAAAACGGA
TTTGTTCACC
GGTTTACGCT
TGATGCTGAG
TGATTCGGTA
CTGTTATGGA
CGAGTCATGG
TGGAGCGTCG
GCCGTTTCTG
GTATGGAATC
CGCGCATAAC
AGACCCGTTG
GTGTATAAAG
GCCACCTGGC
TATAATATAT
TAGATTTTCT
GTTTTGTTTT
TGGGTTTGTG
GGCCAAACCA
CTGATACGTA
TGACAATGAT
AACTAAGAGA
ACTCATAATC
AGACAAGGGA
GCTGAGTATT
GGGAAATGTG
CATCGAGTTA
ACGTTGTACC
CTTAATAACA
TCCGGGTCTG
GCCGTATTAA
AGATTGAGAC
AAACTACCTT
GGAGGAGAAG
TTGGCTAAGG
CGGCGAGGAA
CGGCTTGGAG
GTTTCCATTT
ATTCTCTATT
TGTTTTAGAA
GTTTTTGTGG
TCTTCTTGTC
AAACGCCGAC
TAATGATGCA
AGAAAATGTT
AAGACGCACA
ACACGCTATT
TTGGGTTTAA
CGTATATTGC
GTTATTCGGA
ATGTTTCGCG
GAATCCGGTT
AACTCTCGCT
GATCATGGCC
GCGTGTGGAA
GTGTGGTGGC
CGATACCGTC
GTCCACGCGC
CAGGTGGTTG
TACCACATTG
ATGACTATAG
TTGTAGACCC
AACCACTTGA
TTAATCTTTT
GTGTTATAAA
TTGTAAATGG
CTGATTATTA
ATTTGTGTTA
ACCAATPAACG
TTTTCTTTAA
ATAGATTTTG
GCTCGTGAAG
AACTGAGAAC
GTTTCGTACA
GCGAGTCACG
TAGCACAACA
TGGGACTTTC
CGGTTCGGCT
TTGTAAAGAC
TAAAACGACG
CGTTTTCGAA
GGTGAAGATG
TGGTGTCGTG
GAAAGTGCTA
TGATGGTGTT
TAGAGAATTT
TGTTAAATTA
TTTTAGGTAA
TTAGTGGTAA
ATCTAGCTTT
TTTCCAAGTA
AGACGATACT
ATTAGCATTA
GAGTAAATAA
2760 2820 2880 2940 3000 3060 3120 3180 3240 3300 3360 3420 3480 3540 3600 3660 3720 3780 3840 3900 3960 4020 4080 4140 4146
INFORMATION FOR SEQ ID NO:16:
SEQUENCE CHARACTERISTICS:
LENGTH: 398 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
Met Thr Val Val Arg 1 5 Glu Tyr Asp Pro Thr Arg Asp Leu Val Gly Val 10 Glu Asp Leu Phe Pro Ser Ile Val Lys Leu Tyr Thr Arg Arg Phe Arg 130 Asn Gln 145 Phe Arg Asn Val Glu Thr Asp Ile 210 Ala Val 225 Gly Ser Ser Val Ser Arg Thr Leu 290 Phe Gly 305 Val Lys Ala Gly Val Thr Tyr Gly Asp Lys Gln 115 Gln Ala Thr Ser Leu 195 Asp Pro Ala Trp Leu 275 Pro Leu Met Gly Glu Asp Leu Met Leu Leu 100 Gly Asn Ser Pro Arg 180 Tyr Ser Arg Lys Asn 260 Arg Phe His Val Cys 340 Arg Leu Met Ile Asn Ala Ile Gly Val Ser 165 Arg Arg Val Gly Phe 245 Cys Arg Leu Phe Lys 325 Gly Arg Leu Leu Arg 70 His Tyr Gly Ala Asn 150 Ile Val Ile Leu Ser 230 Leu Lys Val Lys Met 310 Ser Val Cys Gly Val 55 Gly Lys Val Phe Glu 135 Leu Leu Thr Arg Asn 215 Cys Glu Asp Val Leu 295 Tyr Leu Val Glu Asp 40 Ala Cys Ser Leu Lys 120 Tyr Phe Val Val Phe 200 Asn Tyr Tyr Ser Ala 280 Pro Gly Cys Ala Val 25 Pro Glu Ile Gln Gly 105 Leu Ser Thr Asn Ile 185 Ser Lys Gly Pro Phe 265 Lys Ser Ile Ala Ala 345 Gly Ile Met Lys Asn 90 Leu Val Tyr Gly Pro 170 Lys Thr Leu Ser Pro 250 Leu Thr Ile Gly His 330 Glu Pro Cys Gly Thr 75 Asp Arg Lys Ile Lys 155 Val Leu Thr Ser Gly 235 Glu Leu Arg Pro Gly 315 Ala Val Ser Arg Thr Val Val Val Met Ala 140 Cys Tyr Glu Glu Leu 220 Ser Ser Glu Arg Ser 300 Glu His Ala Gly Ile Glu Thr Val Ser Met 125 Thr Gly Ala Pro Phe 205 Gly Gly Trp Val Val 285 Val Gly Asn Gly Lys Arg Lys Cys Lys Pro 110 Glu Glu Tyr His Val 190 Phe Thr Ser Ala Arg 270 Val Phe Pro Leu Glu 350 Leu His Lys Gly Pro Phe Glu Asn Ser Arg 175 Asp Pro Phe Trp Val 255 Gly Asp Glu Arg Ala 335 Asp Ser Ser Glu Gln Leu His Trp Asp Glu 160
Val Ala Arg Val Pro 240 Leu Ala Lys Pro Ala 320 Lys Pro Leu Arg Arg Gly Ile Pro His Trp Lys Val Leu Ser Cys Asp Glu Asp 355 360 365 Leu Trp Cys Ile Lys Arg Leu Gly Asp Asp Tyr Ser Asp Gly Val Val 370 375 380 Gly Asp Trp Thr Lys Cys His Leu Ala Phe Pro Phe Leu Glx 385 390 395
INFORMATION FOR SEQ ID NO:17:
SEQUENCE CHARACTERISTICS:
LENGTH: 12 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
GAGTTGCGCA TG 12
INFORMATION FOR SEQ ID NO:18:
SEQUENCE CHARACTERISTICS:
LENGTH: 4 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
Gly Val Ala His 1
INFORMATION FOR SEQ ID NO:19:
SEQUENCE CHARACTERISTICS:
LENGTH: 24 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:
TGCTACAATC AGAATTCTTG CAGT 24
INFORMATION FOR SEQ ID
SEQUENCE CHARACTERISTICS:
LENGTH: 8 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID
Ala Thr Ile Arg Ile Leu Ala Val 1
INFORMATION FOR SEQ ID NO:21:
SEQUENCE CHARACTERISTICS:
LENGTH: 23 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:
GGATCCTCTA GTCALATTAC CGC 23
INFORMATION FOR SEQ ID NO:22:
SEQUENCE CHARACTERISTICS:
LENGTH: 24 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:
AGATCTGGTA rATTCCGTCT GCAC 24
INFORMATION FOR SEQ ID NO:23:
SEQUENCE CHARACTERISTICS:
LENGTH: 18 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:
CCGGATTCGG TTTGTAGC 18
INFORMATION FOR SEQ ID NO:24:
SEQUENCE CHARACTERISTICS:
LENGTH: 18 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:
GACGTGCATG TTCTTGGG 18
INFORMATION FOR SEQ ID
SEQUENCE CHARACTERISTICS:
LENGTH: 18 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID
GAAAGCCACA TCACCTGC 18
INFORMATION FOR SEQ ID NO:26:
SEQUENCE CHARACTERISTICS:
LENGTH: 17 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:
GGGGTGGAGT TATCCAC 17
INFORMATION FOR SEQ ID NO:27:
SEQUENCE CHARACTERISTICS:
LENGTH: 17 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:
GACACCGGGA AGTATCG 17
INFORMATION FOR SEQ ID NO:28:
SEQUENCE CHARACTERISTICS:
LENGTH: 19 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:
CTGCTTTCAT AGAAGAGGC 19
INFORMATION FOR SEQ ID NO:29:
SEQUENCE CHARACTERISTICS:
LENGTH: 19 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:
GTCAGAACAA
ACCTGCTCC 19
INFORMATION FOR SEQ ID
SEQUENCE CHARACTERISTICS:
LENGTH: 17 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID
CACCCAGGTC TTGGTGG 17
INFORMATION FOR SEQ ID NO:31:
SEQUENCE CHARACTERISTICS:
LENGTH: 16 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:
GGCCGCCATG GATGCG 16
INFORMATION FOR SEQ ID NO:32:
SEQUENCE CHARACTERISTICS:
LENGTH: 18 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:
TCTCAATCAA GAGGAGGC 18
INFORMATION FOR SEQ ID NO:33:
SEQUENCE CHARACTERISTICS:
LENGTH: 18 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:
CTTGAAGGAT CCGAGTGG 18
INFORMATION FOR SEQ ID NO:34:
SEQUENCE CHARACTERISTICS:
LENGTH: 19 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:
CAGGTTGGCG AGTTCCTCG 19
INFORMATION FOR SEQ ID
SEQUENCE CHARACTERISTICS:
LENGTH: 20 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID
CTTGCTGTTA TTCTCCATGC
INFORMATION FOR SEQ ID NO:36:
SEQUENCE CHARACTERISTICS:
LENGTH: 18 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:
CCCTGGACCA GCTCCTGG 18
INFORMATION FOR SEQ ID NO:37:
SEQUENCE CHARACTERISTICS:
LENGTH: 18 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37:
TGGCGCAAGC ATCGTCCC 18
INFORMATION FOR SEQ ID NO:38:
SEQUENCE CHARACTERISTICS:
LENGTH: 20 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:38:
AAATGTTCAG GAATCTCTCG
INFORMATION FOR SEQ ID NO:39:
SEQUENCE CHARACTERISTICS:
LENGTH: 18 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:39:
CTGGCTGGCA GCCACGCC 18
INFORMATION FOR SEQ ID
SEQUENCE CHARACTERISTICS:
LENGTH: 18 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID
GCGTTCTCAA AGCTGCGG 18
INFORMATION FOR SEQ ID NO:41:
SEQUENCE CHARACTERISTICS:
LENGTH: 18 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41:
ACTGATGGGT CTTCTGGG 18
INFORMATION FOR SEQ ID NO:42:
SEQUENCE CHARACTERISTICS:
LENGTH: 18 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42:
GGATCAGGAT GGACCCGG 18
INFORMATION FOR SEQ ID NO:43:
SEQUENCE CHARACTERISTICS:
LENGTH: 18 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43:
TGGTTGCTGA
AGCCAGGG 18
INFORMATION FOR SEQ ID NO:44:
SEQUENCE CHARACTERISTICS:
LENGTH: 19 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:44:
TCCATTCATA GAGAGTGGG 19
INFORMATION FOR SEQ ID
SEQUENCE CHARACTERISTICS:
LENGTH: 19 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID
ATGCCCAAGA ACATGCACG 19
INFORMATION FOR SEQ ID NO:46:
SEQUENCE CHARACTERISTICS:
LENGTH: 20 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:46:
CAACTGATCC TTTACCCTGC
INFORMATION FOR SEQ ID NO:47:
SEQUENCE CHARACTERISTICS:
LENGTH: 19 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:47:
GTTGTTAGGT CAACTTGCG 19
INFORMATION FOR SEQ ID NO:48:
SEQUENCE CHARACTERISTICS:
LENGTH: 19 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:48:
CTCTGTTAGG GCTTCCTCC 19
INFORMATION FOR SEQ ID NO:49:
SEQUENCE CHARACTERISTICS:
LENGTH: 18 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:49:
GAATCAGATT TCGCGAGG 18
INFORMATION FOR SEQ ID
SEQUENCE CHARACTERISTICS:
LENGTH: 18 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID
GTCCAAATGG AGGAAGCC 18
INFORMATION FOR SEQ ID NO:51:
SEQUENCE CHARACTERISTICS:
LENGTH: 23 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:51:
CCACGACTGT ACAATTGACC TTG 23
INFORMATION FOR SEQ ID NO:52:
SEQUENCE CHARACTERISTICS:
LENGTH: 18 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:52:
CATGATCGCA AGTTGACC 18
INFORMATION FOR SEQ ID NO:53:
SEQUENCE CHARACTERISTICS:
LENGTH: 22 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:53:
AGAAAACTCT TATCAAGCTA CG 22
INFORMATION FOR SEQ ID NO:54:
SEQUENCE CHARACTERISTICS:
LENGTH: 20 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:54:
AAGCTTATGG GTGCTCGTGC
INFORMATION FOR SEQ ID
SEQUENCE CHARACTERISTICS:
LENGTH: 20 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID
GGAAAGAGAG AAAGACTCAG
INFORMATION FOR SEQ ID NO:56:
SEQUENCE CHARACTERISTICS:
LENGTH: 18 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:56:
GCCACCAAGT CATACCCG 18
INFORMATION FOR SEQ ID NO:57:
SEQUENCE CHARACTERISTICS:
LENGTH: 18 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:57:
CCTTCTATAT TTGGTTCC 18
INFORMATION FOR SEQ ID NO:58:
SEQUENCE CHARACTERISTICS:
LENGTH: 19 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:58:
CCATTCTCCG GAATAATCC 19
INFORMATION FOR SEQ ID NO:59:
SEQUENCE CHARACTERISTICS:
LENGTH: 20 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:59:
CACGGAGCAG GATAAGGGTA
INFORMATION FOR SEQ ID
SEQUENCE CHARACTERISTICS:
LENGTH: 19 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID
CGGATTGGAT TGTGTGTGC 19
INFORMATION FOR SEQ ID NO:61:
SEQUENCE CHARACTERISTICS:
LENGTH: 19 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:61:
CGCCACTGCA TGTAAGAAC 19
INFORMATION FOR SEQ ID NO:62:
SEQUENCE CHARACTERISTICS:
LENGTH: 19 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:62:
TCCACACGCT TAATACGGC 19
INFORMATION FOR SEQ ID NO:63:
SEQUENCE CHARACTERISTICS:
LENGTH: 18 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:63:
GGTACGGAGA AGAAGGAG 18
INFORMATION FOR SEQ ID NO:64:
SEQUENCE CHARACTERISTICS:
LENGTH: 19 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:64:
CGCGGGATAT TGATTCGGT 19
INFORMATION FOR SEQ ID
SEQUENCE CHARACTERISTICS:
LENGTH: 19 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID
GTGTTGAACA CGCCCACAA 19
INFORMATION FOR SEQ ID NO:66:
SEQUENCE CHARACTERISTICS:
LENGTH: 18 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:66:
ACGACACCAC AACCACCT 18
INFORMATION FOR SEQ ID NO:67:
SEQUENCE CHARACTERISTICS:
LENGTH: 18 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:67:
GACAAGAAGA CACAAACC
INFORMATION FOR SEQ ID NO:68:
SEQUENCE CHARACTERISTICS:
LENGTH: 18 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:68:
GAATCGGAGG AGAAGGTC
INFORMATION FOR SEQ ID NO:69:
SEQUENCE CHARACTERISTICS:
LENGTH: 240 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:69:
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 1 5 10 Xaa Met Phe Gly Tyr Arg Ser Asn Val Pro Lys Val Arg Asp Arg Leu Val Val Arg Leu Val His Asp Arg Asp Ala 40 Ala Asp Tyr Tyr Ala Glu Asn Arg His Phe Leu Lys Pro 55 Xaa Xaa Xaa Thr Thr Trp Arg Leu Trp Glu Pro Val Arg Asp Glu Ser His Cys Tyr Pro Ser Gly Trp Gln Ala Arg Leu 70 75 Gly Met Ile Asn Glu Phe His Lys Gln Gly Ser Ala Phe Tyr Phe Gly 90 Leu Phe Asp Pro Asp Glu Lys Glu Ile Ile Gly Val Ala Asn Phe Ser 100 105 110 Asn Val Val Arg Gly Ser Phe His Ala Cys Tyr Leu Gly Tyr Ser Ile 115 120 125 Gly Gln Lys Trp Gln Gly Lys Gly Leu Met Phe Glu Ala Leu Thr Ala 130 135 140 Ala Ile Arg Tyr Met Gln Arg Thr Gln His Ile His Arg Ile Met Ala 145 150 155 160 Asn Tyr Met Pro His Xaa Xaa Xaa Xaa Asn Lys Arg Ser Gly Asp Leu 165 170 175 Leu Ala Arg Leu Gly Phe Glu Lys Glu Gly Tyr Ala Lys Asp Tyr Leu 180 185 190 Leu Ile Asp Gly Gln Trp Arg Asp His Val Leu Thr Ala Leu Thr Thr 195 200 205 Pro Asp Trp Thr Pro Gly Arg Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 210 215 220 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 225 230 235 240
INFORMATION FOR SEQ ID
SEQUENCE CHARACTERISTICS:
LENGTH: 240 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 1 5 10 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Met Glu Thr Glu Ile Lys Val Ser 25 Glu Ser Leu Glu Leu His Ala Val Ala Glu Asn His Val Lys Pro Leu 40 Tyr Gln Leu Ile Cys Lys Asn Lys Thr Trp Leu Gln Gln Ser Leu Asn 55 Trp Pro Gln Phe Val Gln Ser Glu Glu Asp Thr Arg Lys Thr Val Gln 70 75 Gly Asn Val Xaa Met Leu His Gln Arg Gly Tyr Ala Lys Met Phe Met 90 Ile Phe Xaa Xaa Lys Glu Asp Glu Leu Ile Gly Val Ile Ser Phe Xaa 100 105 110 Asn Arg Ile Glu Pro Leu Asn Lys Thr Ala Glu Ile Gly Tyr Trp Leu 115 120 125 Asp Glu Ser His Gln Gly Gln Gly Ile Ile Ser Gln Ala Leu Gln Ala 130 135 140 Leu Ile His His Tyr Ala Gln Ser Gly Glu Leu Arg Arg Phe Val Ile 145 150 155 160 Lys Cys Arg Val Asp Xaa Xaa Xaa Xaa Asn Pro Gln Ser Asn Gln Val 165 170 175 Ala Leu Arg Asn Gly Phe Ile Leu Glu Gly Cys Leu Lys Gln
Ala Glu 180 185 190 Phe Leu Asn Asp Ala Tyr Asp Asp Val Asn Leu Tyr Ala Arg Ile Ile 195 200 205 Asp Ser Gln Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 210 215 220 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 225 230 235 240
INFORMATION FOR SEQ ID NO:71:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 240 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:71:
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Met Leu Trp Ser Ser Asn Asp Val Thr 1 5 10 Gln Gln Gly Ser Arg Pro Lys Thr Lys Leu Gly Gly Ser Xaa Met Ser 25 Ile Ile Ala Thr Val Lys Ile Gly Pro Asp Glu Ile Ser Ala Met Arg 40 Ala Val Leu Asp Leu Phe Gly Lys Glu Phe Glu Asp Ile Pro Thr Tyr 55 Ser Asp Arg Gln Pro Thr Asn Glu Tyr Leu Ala Asn Leu Leu His Ser 70 75 Glu Thr Phe Ile Ala Leu Ala Ala Phe Asp Arg Gly Thr Ala Ile Gly 90 Gly Leu Ala Xaa Xaa Ala Tyr Val 100 Ser Glu Xaa Xaa Xaa Xaa Xaa Xaa 115 120 Ala Ser Ser His Arg Arg Leu Gly 130 135 Leu Lys Arg Xaa Val Ala Val Glu 145 150 Gln Ala Asp Tyr Gly Xaa Xaa Xaa 165 Tyr Thr Lys Leu Gly Val Arg Glu 180 Pro Arg Thr Ala Thr Xaa Xaa Xaa 195 200 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 210 215 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 225 230
INFORMATION FOR SEQ ID NO:72:
SEQUENCE CHARACTERISTICS:
LENGTH: 240 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
Leu Pro 105 Ile Tyr Val Ala Leu Gly Xaa Asp 170 Asp Val 185 Xaa Xaa Xaa Xaa Xaa Xaa Lys Ile Thr Ala Asp Met Xaa Xaa Xaa 235 Ser Gly Gln Gly Gly2 75 Gln C Lys I Phe Tyr Ala 140 Tyr Pro His Xaa Xaa 220 Xaa Ser Gly lai %sp ksn iu ?he Glu Asp 125 Leu Val Ala Phe Xaa 205 Xaa Xaa Asn Ser Lys Val Leu I Ala I Glu C 1 Gln 110 Leu Ile Ile Val Asp 190 Xaa Xaa Xaa iArg tVal His Val 160 Leu Asp Xaa Xa.
Xaa 240 Thr Gly Arg Tyr Ser Gly Arg Xaa Gln Ile Ala Ser Lys Ala
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:72:
Xaa Xaa Xaa Xaa Xaa Xaa Met Leu Arg 10 Gln Gly Ser Arg Pro Lys Thr Lys Leu Ile Arg Thr Cys Arg Leu Gly Pro Asp 40 Ala Leu Asp Leu Phe Gly Arg Glu Phe 50 Gln His Gln Pro Asp Ser Asp Tyr Leu 70 Thr Phe Ile Ala Leu Ala Ala Phe Asp 90 Leu Ala Xaa Xaa Ala Tyr Val Leu Pro 100 105 Asp Val Ser Met Ser Met Ala Thr Leu Arg Val Val Gln Ala Ser Glu Xaa Xaa Xaa Xaa Xaa Xaa Ile Tyr Ile Tyr Asp Leu Ala Val 115 120 125 Ser Gly Glu His Arg Arg Gln Gly Ile Ala Thr Ala Leu Ile Asn Leu 130 135 140 Leu Lys His Xaa Glu Ala Asn Ala Leu Gly Ala Tyr Val Ile Tyr Val 145 150 155 160 Gln Ala Asp Tyr Gly Xaa Xaa Xaa Xaa Asp Asp Pro Ala Val Ala Leu 165 170 175 Tyr Thr Lys Leu Gly Ile Arg Glu Glu Val Met His Phe Asp Ile Asp 180 185 190 Pro Ser Thr Ala Thr Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 195 200 205 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 210 215 220 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 225 230 235 240
INFORMATION FOR SEQ ID NO:73:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 240 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:73:
Met Thr Thr Leu Asp Asp Thr Ala Tyr Arg Tyr Arg Thr Ser Val Pro 1 5 10 Gly Asp Ala Glu Ala Ile Glu Ala Leu Asp Gly Ser Phe Thr Thr Asp 25 Thr Val Phe Arg Val Thr Ala Thr Gly Asp Gly Phe Thr Leu Arg Glu 40 Val Pro Val Asp Pro Pro Leu Thr Lys Val Xaa Xaa Phe Pro Asp Asp 55 Glu Ser Asp Asp Glu Ser Asp Asp Gly Glu Asp Gly Asp Pro Asp Ser 70 75 Arg Thr Phe Val Ala Tyr Gly Asp Xaa Xaa Xaa Xaa Xaa Xaa Asp Gly 90 Asp Leu Ala Xaa Xaa Gly Phe Val Val Ile Ser Tyr Ser Ala Trp Asn 100 105 110 Arg Arg Xaa Xaa Xaa Xaa Xaa Xaa Leu Thr Val Glu Asp Ile Glu Val 115 120 125 Ala Pro Glu His Arg Gly His Gly Val
Gly Arg A eu Met Gly Leu 130 135 140 Ala Thr Glu Xaa Phe Ala Gly Glu Arg Gly Ala Gly His Leu Trp Leu 145 150 155 160 Glu Val Thr Asn Val Xaa Xaa Xaa Xaa Asn Ala Pro Ala Ile His Ala 165 170 175 Tyr Arg Arg Met Gly Phe Thr Leu Cys Gly Leu Asp Thr Ala Leu Tyr 180 185 190 Asp Gly Thr Ala Ser Asp Gly Glu Arg Gln Ala Leu Tyr Met Ser Met 195 200 205 Pro Cys Pro Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 210 215 220 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 225 230 235 240
INFORMATION FOR SEQ ID NO:74:
SEQUENCE CHARACTERISTICS:
LENGTH: 240 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:74:
Met Thr Thr Thr His Gly Ser Thr Tyr Glu Phe Arg Ser Ala Arg Pro 1 5 10 Gly Asp Ala Glu Ala Ile Glu Gly Leu Asp Gly Ser Phe Thr Thr Ser 25 Thr Val Phe Glu Val Asp Val Thr Gly Asp Gly Phe Ala Leu Arg Glu 40 Val Pro Ala Asp Pro Pro Leu Val Lys Val Xaa Xaa Phe Pro Asp Asp 55 Gly Gly Ser Asp Gly Glu Asp Gly Ala Glu Gly Glu Asp Ala Asp Ser 70 75 Arg Thr Phe Val Ala Val Gly Ala Xaa Xaa Xaa Xaa Xaa Xaa Asp Gly 90 Asp Leu Ala Xaa Xaa Gly Phe Ala Ala Val Ser Tyr Ser Ala Trp Asn 100 105 110 Gln Arg Xaa Xaa Xaa Xaa Xaa Xaa Leu Thr Ile Glu Asp Ile Glu Val 115 120 125 Ala Pro Gly His Arg Gly Lys Gly Ile Gly Arg Val Leu Met Arg His 130 135 140 Ala Ala Asp Xaa Phe Ala Arg Glu 145 150 Glu Asn Thr Asn Val Xaa Xaa Xaa 165 Tyr Arg Arg Met Gly Phe Ala Phe 180 Gln Gly Thr Ala Ser Glu Gly Glu 195 200 Pro Cys Pro Xaa Xaa Xaa Xaa Xaa 210 215 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 225 230
INFORMATION FOR SEQ ID
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 240 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
Arg Gly Ala 155 Xaa Asn Ala 170 Cys Gly Leu 185 Xaa His Ala Xaa Xaa Xaa Xaa Xaa Xaa 235 Gly Pro Asp Leu Xaa 220 Xaa Ser His Phe
Xaa Xaa Xaa Xaa Glu Ser I 140 Leu C His Ala Ser Tyr 205 Xaa Xaa Val Phe Glu Ile Kaa (aa 3er L2 .eu 1y Let Ile Ala 190 Met Xaa Xaa Ile Ile Leu Ser Xaa Xaa Thr 110 Ile Ile Ile Trp His 175 Leu Ser Xaa Xaa Pro Val Ser Asp Asp Asp Trp Val Glu Arg Leu 160 Ala Tyr Met Xaa Xaa 240 Glu Arg Thr Asp Ser Gln Asn Val Phe Leu 160 Xaa Gln Glu Arg Asp Ala Glu Asp Ser Ala 145
(xi) SEQUENCE DESCRIPTION: SEQ ID
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Met Lys 10 Val Ala Glu Thr Leu Asp Ala Xaa Glu 25 Val Phe Asp Val His Leu Ser Asp Gln Ser Val Ser Pro Tyr Arg Lys Asp Tyr 55 Ser Asp Glu Xaa Xaa Xaa Xaa Xaa Xaa 70 Cys Tyr Gly Ala Phe Xaa Ile Xaa Xaa 90 Leu Val Xaa Xaa Gly Lys Ile Glu Leu 100 105 Leu Xaa Xaa Xaa Xaa Xaa Xaa Ala Ser 115 120 His Thr His Arg Gly Lys Gly Val Ala 130 135 Lys Lys Xaa Trp Ala Leu Ser Arg Gln 150 Ile Asn Gly Xaa Xaa 75 Xaa Asn Ile His Leu 155 Glu Thr Gln Thr Asn Xaa Xaa 165 Tyr Ala Lys Cys Gly Phe Thr 180 Lys Thr Arg Pro Gln Val Ser 195 Phe Ser Gly Ala Gln Asp Asp 210 215 Xaa Xaa Xaa Xaa Xaa Xaa Xaa 225 230
INFORMATION FOR SEQ ID NO:76:
SEQUENCE CHARACTERISTICS:
LENGTH: 240 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
Xaa Xaa Asn Val Pro Ala Cys Asn Leu 170 175 Leu Gly Gly Ile Asp Leu Phe Thr Tyr 185 190 Asn Glu Thr Ala Met Tyr Trp Tyr Trp 200 205 Ala Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 220 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 235 240
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:76:
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Met Ala Xaa Met Phe Glu Gly Lys Met Leu 145 Lys Xaa Glu Gly His Phe Leu Ser 130 Ser Phe Asp Asp Glu Trp Ala Xaa 115 Asp Gln Lys Ile Gln His Thr Xaa 100 Xaa Tyr Xaa Ile Leu Val Pro Pro Xaa Xaa Arg Val Trp 165 Arg Pro Arg Leu Ile Leu 55 Phe Tyr Xaa Xaa Met Tyr Xaa Xaa Gly Phe 135 Ala Met 150 Xaa Xaa Ai~.
Ile 40 Thr His Xaa Tyr Xaa 120 Gly Lys Xaa Thr Ala Ser 25 Lys Glu Leu Glu Lys Asp Cys Leu Val Xaa Xaa Glu 90 Phe Thr Tyr 105 Leu Tyr Leu Ile Gly Ser Cys Arg Cys 155 Xaa Asn Glu 170 Asp Ala Leu Ala Gly Asp Glu Glu 140 Ser Pro Cys Lys Gln Glu His Pro Asp 125 Ile Ser Ser is Ser Xaa Tyr Glu Glu Asp Val Pro Ser Ile Trp Ile 110 Phe Phe Leu Lys Met His Ile Asn 175 Leu Val Ala Glu Tyr Lys Arg Arg Gly Ala Ser Asp Leu Ser Ser Glu Glu Gly Trp Xaa 180 185 190 Xaa Xaa Xaa Xaa Arg Leu Phe Lys Ile Asp Lys Glu Tyr Leu Leu Lys 195 200 205 Met Ala Ala Glu Glu Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 210 215 220 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 225 230 235 240
INFORMATION FOR SEQ ID NO:77: SEQUENCE CHARACTERISTICS: LENGTH: 240 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO:77:
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Met 1 5 10 Ala Lys Phe Val Ile Arg Pro Ala Thr Ala Ala Asp Cys Ser Xaa Xaa 25 Xaa Xaa Asp Ile Leu Arg Leu Ile Lys Glu Leu Ala Lys Tyr Glu Tyr 40 Met Glu Glu Gln Val Ile Leu Thr Glu Lys Asp Leu Leu Glu Asp Gly 55 Phe Gly Glu His Pro Phe Tyr His Cys Leu Val Ala Glu Val Pro Lys 70 75 Glu His Trp Thr Pro Xaa Xaa Xaa Xaa Xaa Glu Gly His Ser Ile Val 90 Gly Phe Ala Xaa Xaa Met Tyr Tyr Phe Thr Tyr Asp Pro Trp Ile Gly 100 105 110 Lys Leu Xaa Xaa Xaa Xaa Xaa Xaa Leu Tyr Leu Glu Asp Phe Phe Val 115 120 125 Met Ser Asp Tyr Arg Gly Phe Gly Ile Gly Ser Glu Ile Leu Lys Asn 130 135 140 Leu Ser Gln Xaa Val Ala Met Arg Cys Arg Cys Ser Ser Met His Phe 145 150 155 160 Leu Val Ala Glu Trp Xaa Xaa Xaa Xaa Asn Glu Pro Ser Ile Asn Phe 165 170 175 Tyr Lys Arg Arg Gly Ala Ser Asp Leu Ser Ser Glu Glu Gly Trp Xaa 180 185 190 Xaa Xaa Xaa Xaa Arg Leu Phe Lys Ile Asp Lys Glu Tyr Leu Leu Lys 195 200 205
Met Ala Thr Glu Glu Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 210 215 220 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 225 230 235 240
INFORMATION FOR SEQ ID NO:78: SEQUENCE CHARACTERISTICS: LENGTH: 240 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO:78:
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Met 1 5 10 Asn His Ala Gln Leu Arg Arg Val Thr Ala Glu Ser Phe Ala His Tyr 25 Arg His Gly Leu Ala Gln Leu Leu Phe Glu Thr Val His Gly Gly Xaa 40 Xaa Ala Ser Val Gly Phe Met Ala Asp Leu Asp Met Gln Gln Ala Tyr 55 Ala Trp Cys Asp Gly Leu Lys Ala Asp Ile Ala Ala Gly Ser Leu Leu 70 75 Leu Trp Val Val Ala Xaa Xaa Xaa Xaa Xaa Glu Asp Asp Asn Val Leu 90 Ala Ser Ala Xaa Xaa Gln Leu Ser Leu Cys Gln Lys Pro Asn Gly Leu 100 105 110 Asn Arg Xaa Xaa Xaa Xaa Xaa Xaa Ala Glu Val Gln Lys Leu Met Val 115 120 125 Leu Pro Ser Ala Arg Gly Arg Gly Leu Gly Arg Gln Leu Met Asp Glu 130 135 140 Val Glu Gln Xaa Val Ala Val Lys His Lys Arg Gly Leu Leu His Leu 145 150 155 160 Asp Thr Glu Ala Xaa Xaa Xaa Xaa Xaa Gly Ser Val Ala Glu Ala Phe 165 170 175 Tyr Ser Ala Leu Ala Tyr Thr Arg Val Gly Glu Leu Pro Gly Tyr Cys 180 185 190 Ala Thr Pro Asp Gly Arg Leu His Pro Thr Ala Ile Tyr Phe Lys Thr 195 200 205 Leu Gly Gln Pro Thr Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 210 215 220 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 225 230 235 240
INFORMATION FOR SEQ ID NO:79: SEQUENCE CHARACTERISTICS: LENGTH: 240 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO:79:
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 1 5 10 Xaa Gln Xaa His Arg Val Xaa Gln Ile 145 Glu Tyr Pro Xaa Asp Xaa Leu Phe Gly Xaa Pro 130 Gln Thr Arg Asp Xaa
Al a Xaa Xaa Len Cys Xaa 115 Thr Asp GlvI Lys Pro Xaa Val Xaa Xaa Vai Xaa 100 Xaa Ala Xaa Val Gin 180 Len Met Val Xaa Xaa Ala Xaa Arg Gin Tyr 165 Gly Ser Pro Gin Tyr Len 70 Xaa Giy Xaa Gly Ala 150 Xaa Phe Len Asn Len Len 55 Asp Xaa Ala Xaa Gly 135 Arg Xaa Ala.
Phe Val Ile 40 Gly Len Xaa Ile Xaa 120 Gin Ala Xaa Asp Met 200 Thr 25 Gln Asp Gin Xaa Al a 105 Gly Ile Ala Xaa Arg 185 Gin Xaa Ile Gin Leu Thr Xaa Ile Gin Gly Gi’; Gin 170 Gly Lys Ala Len Tyr Len 75 Arg Asp Val Arg Len 155 Ala Pro Pro Arg Asp Pro Ala Arg Thr Lys Arg 140 Ser Thr Phe Len Glu Arg Ala Lys S er Glu Arg 125 Leu Ala Arg Gly Xaa Ser Xaa Giu Pro Gly Gly 110 Met Len Leu Ile Pro 190 Xaa Pro Xaa Ser Asp Thr Gly Phe Glu Leu Ala 175 Tyr Xaa Len Xaa Asn Ile Val Tyr Val Arg Leu 160 Leu Gly Xaa 195 205 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 220 SUBSTITUTE SH-EET (RULE 26)~ NVO 95/35318 WO 9535318PCT[US95!07744 Xaa Xaa Xaa Xaa Xaa xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 225 230 235 240 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 240 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (iii) HYPOTHETICAL,: NO (iv) ANTI-SENSE: 11O (xi) SEQUENCE DESCRIPTION: SEQ ID Xaa Xaa Xaa Xaa Xaa Met Pro Ile Asn Ile Arg Arg Ala Thr Xaa Ile 1 5 10 Asn Asn Al a Glu Asp Lys Asp Met Ala 145 His Tyr Giu Val Asp Tyr Ser Gin Gly Leu Gin Arg 130 Leu Val Arg Lys Leu 210 Ile Met Phe Asp Arg Val Gin Thr Phe Arg Asp Ser 195 Lys le Met Val Glu Thr Xaa 100 Asn Tyr Al a Gin Thr 180 Tyr Leu Cys Lys Al a Asn Ile Xaa Giu Arg Leu Ser 165 Leu Tyr Giu Met Gin Tyr Tyr Thr Thr 55 Asp Lys 70 Lys Leu Gly Tyr Pro Pro Arg Met 135 Arg Giu 150 Xaa Xaa Ala Phe Gin Asp Giu Leu 215 Asn Met 40 Thr Lau Asp Val Asn 120 Gly Val Xaa G fu Gly 200 Gin Ala 25 Tyr Thr Giu Pro Leu 105 Gly Ile His Xaa Val 185 Glu Ile Asn His Leu Leu Thr 90 Val His Ala Gin Asn 170 Leu Asp Ser Leu His Thr Leu Asp Cys Thr Leu 75 Tyr Leu Lys Met Ile Thr Giu Asn 140 Ala Giu 155 Arg Ala Ser Xaa Ala Tyr Asn Xaa 220 Asn Ser Giu Asp Ala Asn Ser 125 Leu Tyr Aia Xaa Ala 205 Xaa Leu Trp Asp Gly Pro Asp 110 Leu Met Val Leu Xaa 190 Met Xaa Pro Pro Ser Thr Gly Asp Ser Arg S er His 175 Xaa Lys Phe Glu Giu Asp Asn Glu Pro Val Gin Leu 160 Leu Ile Lys Thr His Arg ;rg Leu Lys Giu 
Asn Glu Giu Lys Leu Giu Asp Asp Leu Giu SUBSTITUTE SHEET (RULE 26) WO 95/35318 225 230 INFORMATION FOR SEQ ID NO:81: SEQUENCE CHARACTERISTICS: LENGTH: 240 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO PCT/US95/07744 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:81: Met Glu Ile Val Tyr Lys Pro Leu Asp Ile Arg Asn Glu Glu Gin Phe 1 5 10 Ala Ile Leu Xaa Gly Xaa Pro Glu Ala 145 Glu Tyr Met Ile His 225 Ser Tyr Thr Xaa Thr Xaa His Ser 130 Ile Thr Xaa Phe Leu 210 Gly Ile Val Tyr Xaa Pro Ile Arg 115 Thr Asp Glu Glu Arg 195 Pro Arg Lys Tyr Ile Xaa Asn Xaa 100 Asn Tyr Lys Val Gly 180 Tyr Leu Leu Lys Arg Ala Xaa Ile Xaa Val Arg Met Glu 165 Met Tyr Thr Ala Leu Tyr Xaa Xaa 70 Pro Gly Arg Gly Gln 150 Xaa Gly Leu Glu Thr 230 Ile Phe Xaa 55 Xaa Xaa Cys Leu His 135 Arg Xaa Phe Asn Lys 215 Xaa Asp Leu 40 Xaa Xaa Xaa Ile Arg 120 Gly Glu Xaa Ile Glu 200 Ser Xaa Ala 25 Asn Xaa Xaa Xaa Val 105 Gly Ile His Xaa Arg 185 Gly Cys Asp Gin Xaa Xaa Xaa 90 Cys Tyr Ala Cys Asn 170 Met Asp Thr Leu Xaa Xaa Xaa 75 Xaa Lys Ile Lys Asp 155 Ser Lys Ala Arg Ser Xaa Xaa Val Xaa Met Gly Lys 140 Glu Ala Xaa Phe Ser 220 Glu Xaa Xaa Asp Xaa Asp Met 125 Leu Xaa Ala Xaa Lys 205 Thr Pro Trp Xaa Asn Xaa Xaa 110 Leu Val Ile Leu Xaa 190 Leu Phe Tyr Pro Xaa Lys Xaa Xaa Ala Glu Met Asn 175 Xaa Xaa Leu Ser Glu Xaa Ser Xaa Xaa Val lie Leu 160 Leu Arg Xaa Met Xaa 240 Xaa Xaa Xaa 235 Xaa Xaa Xaa Xaa SUBSTITUTE SHEET (RULE 26) WO 95/35318 WO 9535318PCTUMS95/07744 INFORMATION FOR SEQ ID NO:82: Wi SEQUENCE CHARACTERISTICS: LENGTH: 240 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (iii) HYPOTHETICAL: NO ANTI-SENSE NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO:82: Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 1. 5 10 Xaa Met Glu Ser Glu Gly Xaa Arg Leu 145 Gly Thr Lys Ile Xaa Asn.
Ala Trp Ser Trp Xaa Pro 130 Glu Thr Glu His Ile 210 Xaa Tyr Ala Pro Pro Ile Xaa 115 Asp Asn AspD Asp Pro 195 Pro Xaa Gin Asn Asp Asn Xaa 100 Xaa Tyr Arg Asp Asn 180 Tyr Asn Xaa Ile Ile Met Leu Xaa Xaa Gin Xaa Glu 165 Glu Al a Xaa Val Leu Thr 70 Cys Gly Xaa Asn Ala 150 Tyr Phe Phe Asn Xaa Asn Thr Ser Phe Leu Xaa Lys 135 Arg Tyr Asp Tyr Gly 215 Xaa Ile 40 Giu Ala Giy Arg Xaa 120 Gly Giu Arg Ser Gin 200 Lys Xaa 25 Aila Ala Thr Leu Pro 105 Xaa Ile Gin Thr Ile 185 Lys Asn Xaa Glu Phe Lys Leu 90 Met Xaa Gly Gly Ser 170 Lys Asn Lys X.aa Cys Asn Glu 75 Ile Tyr Leu Lys Ile 155 Leu Asn Gly Pro Xaa Ser Asp Val Asn Lys His Ile 140 Ile Ser Ile Tyr Asp 220 Xaa Asn Leu Lys Asn Giu Pro 125 Leu Gly Leu Lys Tyr 205 Ile Xaa Tyr Gly Giu S er Thr 110 Leu Leu Ile Ile Asn 190 Ile Trp Xaa Gin Asn Cys Leu Trp Val Lys Ala Thr 175 I’le Val Met Xaa Leu Asn Ile Val Glu Val Giu Leu 160 Ile Asn Gly Trp Ser Leu Ile Lys Xaa Xaa Xaa Xaa Xaa 235 Xaa Xaa Xaa Xaa SUBSTITUTE SHEET (RULE 26)
Claims (13)
1. An isolated nucleic acid sequence comprising a sequence selected from the group consisting of SEQUENCE ID NOS: 1 and 2.
2. An isolated protein sequence comprising the amino acid sequence set forth in SEQUENCE ID NO: 3.
3. An isolated nucleic acid sequence comprising the sequence set forth in SEQUENCE ID NO: 4.
4. An isolated nucleic acid sequence comprising the sequence set forth in SEQUENCE ID NO:
5. An isolated protein sequence comprising the amino acid sequence set forth in SEQUENCE ID NO: 6.
6. An isolated nucleic acid sequence comprising the sequence set forth in SEQUENCE ID NO: 7.
7. An isolated protein sequence comprising the amino acid sequence set forth in SEQUENCE ID NO: 8.
8. An isolated nucleic acid sequence comprising the sequence set forth in SEQUENCE ID NO: 9.
9. An isolated protein sequence comprising the amino acid sequence set forth in SEQUENCE ID NO:
10. An isolated nucleic acid sequence comprising the sequence set forth in SEQUENCE ID NO: 11.
11. An isolated protein sequence comprising the amino acid sequence set forth in SEQUENCE ID NO: 12.
12. An isolated nucleic acid sequence comprising a sequence selected from the group consisting of SEQUENCE ID NO: 14 and
13. An isolated protein sequence comprising the amino acid sequence set forth in SEQUENCE ID NO: 16.
14. A DNA sequence comprising a sequence complementary to an isolated nucleic acid sequence of claim 1.
15. A transformed plant cell comprising the nucleic acid sequence selected from the group consisting of SEQUENCE ID NOS: 1, 2, 4, 5, 7, 9, 11, 14, and
16. A plant comprising a heterologous nucleic acid sequence selected from the group consisting of SEQ ID NOS: 1, 2, 4, 5, 7, 9, 11, 14, and
17. A DNA sequence comprising a sequence complementary to an isolated nucleic acid sequence selected from the group consisting of SEQ ID NOS: 1, 2, 4, 5, 7, 9, 11, 14, and
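The claims to a "sequence complementary to" a nucleic acid sequence rest on Watson-Crick base pairing (A with T, C with G). As an illustration only, the following minimal Python sketch computes the complement and the reverse complement of a DNA string. The function names and the example string are hypothetical and are not taken from the specification or from any listed SEQ ID NO.

```python
# Illustrative sketch only: Watson-Crick complement of a DNA string.
# The names below and the example sequence are assumptions for illustration;
# they do not come from the patent or its sequence listing.

# Translation table mapping each base to its pairing partner (case preserved).
_COMPLEMENT_TABLE = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(seq: str) -> str:
    """Return the base-by-base complement, read in the same direction."""
    return seq.translate(_COMPLEMENT_TABLE)


def reverse_complement(seq: str) -> str:
    """Return the complementary strand written 5' to 3'."""
    return complement(seq)[::-1]


if __name__ == "__main__":
    example = "ATGGCC"  # hypothetical six-base example
    print(complement(example))
    print(reverse_complement(example))
```

Characters other than A, C, G, and T (for example the ambiguity code N) pass through unchanged in this sketch.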
AU28650/95A | 1994-06-17 | 1995-06-15 | Plant genes for sensitivity to ethylene and pathogens | Ceased | AU686408B2 (en)
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title
US261822 | 1994-06-17 | |
US08/261,822 US5650553A (en) | 1992-06-16 | 1994-06-17 | Plant genes for sensitivity to ethylene and pathogens
PCT/US1995/007744 WO1995035318A1 (en) | 1994-06-17 | 1995-06-15 | Plant genes for sensitivity to ethylene and pathogens
Publications (2)
Publication Number | Publication Date
AU2865095A (en) | 1996-01-15
AU686408B2 (en) | 1998-02-05
Family ID: 22995037
Family Applications (1)
Application Number | Title | Priority Date | Filing Date
AU28650/95A (Ceased) AU686408B2 (en) | Plant genes for sensitivity to ethylene and pathogens | 1994-06-17 | 1995-06-15
Country Status (5)
Country | Link
US (1) | US5650553A (en)
EP (1) | EP0763060A4 (en)
AU (1) | AU686408B2 (en)
CA (1) | CA2193255A1 (en)
WO (1) | WO1995035318A1 (en)
Families Citing this family (14)
* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US5955652A (en) * | 1992-06-16 | 1999-09-21 | The Trustees Of The University Of Pennsylvania | Plant genes for sensitivity to ethylene and pathogens
US6211437B1 (en) | 1997-03-04 | 2001-04-03 | Pioneer Hi-Bred International, Inc. | Nucleic acids from maize encoding proteins which suppress plant cell death
AU8066398A (en) * | 1997-06-12 | 1998-12-30 | E.I. Du Pont De Nemours And Company | Plant transcription coactivators with histone acetyltransferase activity
US6198020B1 (en) | 1998-02-26 | 2001-03-06 | Pioneer Hi-Bred International, Inc. | Nitric oxide as an activator of the plant pathogen defense systems
JP2000050877A (en) * | 1998-08-11 | 2000-02-22 | Natl Inst Of Agrobiological Resources | Transcription factor which regulates expression of ethylene-inducing gene cluster
JP2002541763A (en) * | 1998-11-25 | 2002-12-10 | The Trustees of the University of Pennsylvania | Ethylene response factor 1 (ERF1) in plants
US6359198B1 (en) | 1999-01-12 | 2002-03-19 | Genesis Research & Development Corporation Ltd. | Compositions isolated from plant cells and their use in the modification
US6768041B2 (en) * | 1999-01-12 | 2004-07-27 | Genesis Research And Development Corporation Limited | Compositions isolated from plant cells and their use in the modification of plant cell signaling
US20050050583A1 (en) * | 1999-01-12 | 2005-03-03 | Agrigenesis Biosciences Limited | Compositions isolated from plant cells and their use in the modification of plant cell signaling
AU5925100A (en) * | 1999-07-15 | 2001-02-05 | Pioneer Hi-Bred International, Inc. | Maize ethylene signalling pathway ein3 genes and uses thereof
US6562595B2 (en) * | 2000-02-18 | 2003-05-13 | Mcgill University | Dominant selectable marker for gene transformation and disruption in yeasts
KR100736797B1 (en) * | 2006-03-28 | 2007-07-09 | Republic of Korea (Management: Rural Development Administration) | Scar primer set for the discrimination of mating type of phytophthora infestans
US8277650B2 (en) | 2009-03-13 | 2012-10-02 | Terrasep, Llc | Methods and apparatus for centrifugal liquid chromatography
CN114990139B (en) * | 2022-04-24 | 2023-06-23 | Hunan Agricultural University | Application of CsHLS1 gene or protein encoded by same in regulation and control of organ size of cucumber plant
Family Cites Families (1)
* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US5367065A (en) * | 1992-08-10 | 1994-11-22 | The Trustees Of The University Of Pennsylvania | Constitutive triple response gene and mutations
1994
1994-06-17 | US | US08/261,822 | patent/US5650553A/en | not_active | Expired – Lifetime
1995
1995-06-15 | CA | CA002193255A | patent/CA2193255A1/en | not_active | Abandoned
1995-06-15 | WO | PCT/US1995/007744 | patent/WO1995035318A1/en | not_active | Application Discontinuation
1995-06-15 | EP | EP95923956A | patent/EP0763060A4/en | not_active | Withdrawn
1995-06-15 | AU | AU28650/95A | patent/AU686408B2/en | not_active | Ceased
Also Published As
Publication number | Publication date
CA2193255A1 (en) | 1995-12-28
EP0763060A1 (en) | 1997-03-19
WO1995035318A1 (en) | 1995-12-28
US5650553A (en) | 1997-07-22
EP0763060A4 (en) | 1998-06-17
AU2865095A (en) | 1996-01-15
Similar Documents
Publication | Publication Date | Title
AU708081B2 (en) | 1999-07-29 | RPS2 gene and uses thereof
AU686408B2 (en) | 1998-02-05 | Plant genes for sensitivity to ethylene and pathogens
CA2459079C (en) | 2014-12-09 | Plant-derived resistance gene
JPH06319558A (en) | 1994-11-22 | Hypersensitivity related gene
AU697247B2 (en) | 1998-10-01 | Plant pathogen resistance genes and uses thereof
CN108642065B (en) | 2020-01-21 | Rice endosperm aleurone related gene OsSecY2 and encoding protein and application thereof
US6433251B1 (en) | 2002-08-13 | Promoter regulating circadian clock function and photoperiodism
WO1994010831A1 (en) | 1994-05-26 | Induction of dwarfing and early flowering using group 3 lea proteins
US6225527B1 (en) | 2001-05-01 | Plant pathogen resistance genes and uses thereof
JP2002524044A (en) | 2002-08-06 | Plant disease resistance signaling gene: Materials and methods related thereto
CN112724210A (en) | 2021-04-30 | Plant amyloplast development related protein OsSSG7 and coding gene and application thereof
CA2284489A1 (en) | 1998-09-24 | Plant genes for sensitivity to ethylene and pathogens
CN110256543B (en) | 2022-09-20 | PwNAC1 gene and application of encoding protein thereof in plant stress resistance
WO2000008189A2 (en) | 2000-02-17 | Plant resistance gene
CA2363686A1 (en) | 2000-09-21 | Ve protein and nucleic acid sequences, compositions, and methods for plant pathogen resistance
WO2005084115A2 (en) | 2005-09-15 | Ttg3 deficient plants, nucleic acids, polypetides and methods of use thereof
WO2002020725A2 (en) | 2002-03-14 | Lnp, a protein involved in the initiation of mycorrhizal infection in plants
AU760802B2 (en) | 2003-05-22 | Recombination repair gene, MIM, from Arabidopsis thaliana
CA2215496A1 (en) | 1996-10-03 | Plant pathogen resistance genes and uses thereof
WO2001020973A2 (en) | 2001-03-29 | Plant genes for sensitivity to ethylene and pathogens
MXPA96004739A (en) | 1998-10-23 | Gene rps2 and its u