FUTURE DEEP LEARNING TECHNOLOGY:
AMINO ACID SEQUENCE GENERATION
Existing LSTM architectures are not precise enough to reliably predict long amino acid sequences. If such a model were incorporated into LyseDevice, even a single misclassified residue could significantly alter the efficacy of the synthesized lysin.
For LyseDevice to become a feasible technology, high-accuracy sequence-to-sequence deep learning techniques must be developed.
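To illustrate why per-residue precision matters so much for long sequences, the short sketch below estimates the chance that an entire predicted sequence is error-free. It rests on a simplifying assumption that is not part of LyseDevice itself: each residue is predicted independently with the same accuracy. The function names, the 300-residue length, and the 99% accuracy figure are hypothetical values chosen only for illustration.

```python
def exact_match_probability(per_residue_accuracy: float, length: int) -> float:
    """Chance that every residue in a sequence is predicted correctly.

    Assumes (for illustration only) that errors at different positions
    are independent and equally likely.
    """
    return per_residue_accuracy ** length


def required_per_residue_accuracy(target_probability: float, length: int) -> float:
    """Per-residue accuracy needed to reach a target whole-sequence accuracy."""
    return target_probability ** (1.0 / length)


# Hypothetical example: a 300-residue lysin predicted at 99% per-residue accuracy.
p_perfect = exact_match_probability(0.99, 300)  # roughly 0.05

# Accuracy each residue would need for a 95% chance of a flawless sequence.
p_needed = required_per_residue_accuracy(0.95, 300)
```

Under these assumptions, a model that looks accurate residue by residue would still produce a flawless 300-residue sequence only about one time in twenty, which is why higher-accuracy techniques are needed.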
LyseDevice uses sequence-to-sequence long short-term memory (LSTM) networks, recurrent neural networks used in machine translation that can generate output sequences of arbitrary length.
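The ability to generate output of arbitrary length comes from decoding one token at a time until the model emits an end marker. The sketch below shows that loop in plain Python. It is not LyseDevice's implementation: `greedy_decode`, `toy_step`, and the `<start>`/`<end>` tokens are hypothetical names, and `toy_step` is a stand-in for a trained LSTM decoder step.

```python
from typing import Callable, List, Tuple

START, END = "<start>", "<end>"

# A decoder step takes (state, previous token) and returns (new state, next token).
# In a real model this would be one LSTM step followed by picking the most likely residue.
StepFn = Callable[[object, str], Tuple[object, str]]


def greedy_decode(step_fn: StepFn, initial_state: object, max_len: int = 1000) -> str:
    """Generate residues one at a time until END is emitted or max_len is reached."""
    residues: List[str] = []
    state, token = initial_state, START
    for _ in range(max_len):
        state, token = step_fn(state, token)
        if token == END:
            break
        residues.append(token)
    return "".join(residues)


def toy_step(state: str, previous_token: str) -> Tuple[str, str]:
    """Hypothetical stand-in for a trained decoder: emits the residues held
    in `state` one by one, then signals the end of the sequence."""
    if not state:
        return state, END
    return state[1:], state[0]


sequence = greedy_decode(toy_step, "MKT")
```

The output length is never fixed in advance; the loop stops when the decoder decides the sequence is complete, with `max_len` as a safety limit.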
The LSTM is trained on the publicly available Actinobacteriophage Database, which contains 18,000+ lysins and their respective bacterial host cells. Given a bacterium's genome, the trained network predicts the amino acid sequence of the lysin theoretically best suited to treat an infection by that bacterium.
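Before training, each pairing of host genome and lysin has to be turned into numeric sequences. The sketch below shows one plausible way to do that. It is an assumption for illustration only: the database's actual record format, the per-nucleotide encoding, and the names `encode_genome`, `encode_lysin`, and `make_pair` are all hypothetical.

```python
from typing import List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD, START, END = 0, 1, 2             # reserved ids for the output vocabulary

# Residue ids begin after the reserved ids.
AA_TO_ID = {aa: i + 3 for i, aa in enumerate(AMINO_ACIDS)}
# Nucleotide ids begin at 1; 0 marks padding or an unknown base such as N.
NT_TO_ID = {nt: i + 1 for i, nt in enumerate("ACGT")}


def encode_genome(genome: str) -> List[int]:
    """Map a nucleotide string to integer ids (the model input)."""
    return [NT_TO_ID.get(nt, 0) for nt in genome.upper()]


def encode_lysin(protein: str) -> List[int]:
    """Map an amino acid string to ids wrapped in START/END (the model target)."""
    return [START] + [AA_TO_ID[aa] for aa in protein.upper()] + [END]


def make_pair(genome: str, lysin: str) -> Tuple[List[int], List[int]]:
    """Build one (input, target) training example."""
    return encode_genome(genome), encode_lysin(lysin)


# Hypothetical toy record, far shorter than any real genome or lysin.
source_ids, target_ids = make_pair("acgtn", "MK")
```

The START and END ids mark the sequence boundaries so that the decoder can learn when a lysin sequence is complete.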
Even when it encounters a new bacterium with no existing lysin treatment, sequence generation allows LyseDevice to create a novel lysin to target it.