Bioencoder

An amino acid sequence encoding toolbox for machine learning.

Introduction

Main features:

Designed for machine learning workflows
A rich set of encodings (NUM, BE, EAAC, AAINDEX, GAAC, CKSAAP)
Native support for large FASTA files
Works out of the box

Installation Tutorial

Via PIP

python setup.py bdist_wheel
pip install ./dist/bioencoder-1.0.0-py3-none-any.whl

Usage

Reading from a FASTA File

Given a standard FASTA file such as:

>1|1
DGMRITLRDGCIVHLRASGNAPELRCYAEANLLNRAQDLVNTTLANIKKRC
>2|1
EGKLSMLQNTIKRLASLSTEEPVVICNDRHRFLVAEQLREIDKLANNIILE

To read the sequences and encode them (the call below selects the encoding with method="GAAC"):

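from bioencoder import *

pos_data = "pos.fasta"  # path to the FASTA file
window_size = 12
# get_data returns three lists: encoded sequences, labels, and sequence names
pos_seqList, pos_labellist, pos_seqNamelist = get_data(
    pos_data, 1, method="GAAC", window_size=window_size
)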
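
For readers new to the format, the following is a minimal standard-library sketch of how a FASTA file like the one above splits into header and sequence records. It is not bioencoder's get_data implementation; parse_fasta is a hypothetical helper introduced only for illustration, and it leaves headers such as 1|1 uninterpreted.

```python
import io
from typing import Iterator, List, Optional, TextIO, Tuple


def parse_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an open FASTA file.

    Hypothetical helper for illustration; not part of bioencoder.
    """
    header: Optional[str] = None
    chunks: List[str] = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequences may span several lines; collect them all.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# The two records shown above, held in memory instead of a file.
sample = io.StringIO(
    ">1|1\n"
    "DGMRITLRDGCIVHLRASGNAPELRCYAEANLLNRAQDLVNTTLANIKKRC\n"
    ">2|1\n"
    "EGKLSMLQNTIKRLASLSTEEPVVICNDRHRFLVAEQLREIDKLANNIILE\n"
)
records = list(parse_fasta(sample))
for header, sequence in records:
    print(header, len(sequence))
```

Reading line by line keeps memory use proportional to one record at a time, which matters for large files.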

Reading from a raw sequence

For example, to encode a raw sequence string with EAAC:

from bioencoder.encoder import *

# The sequence to encode, as a plain string of one-letter amino acid codes
seq = 'DGMRITLRDGCIVHLRASGNAPELRCYAEANLLNRAQDLVNTTLANIKKRC'
print(EAAC(seq, window=5))
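
To make the idea behind EAAC concrete, here is a pure-Python sketch of the commonly described scheme: slide a fixed-size window along the sequence and record the frequency of each of the 20 standard amino acids in every window. This is an illustration under stated assumptions, not bioencoder's implementation. The function name eaac_sketch and the alphabetical residue order are assumptions, and the library's output may differ in ordering, shape, or handling of non-standard characters.

```python
from collections import Counter
from typing import List

# Assumed residue order (alphabetical); the library may use another order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def eaac_sketch(seq: str, window: int = 5) -> List[float]:
    """Return per-window amino acid frequencies as one flat list.

    Hypothetical helper for illustration; not part of bioencoder.
    """
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    features: List[float] = []
    # One window per start position, moving one residue at a time.
    for start in range(len(seq) - window + 1):
        counts = Counter(seq[start:start + window])
        # 20 frequencies per window, each count divided by the window size.
        features.extend(counts[aa] / window for aa in AMINO_ACIDS)
    return features


seq = "DGMRITLRDGCIVHLRASGNAPELRCYAEANLLNRAQDLVNTTLANIKKRC"
vector = eaac_sketch(seq, window=5)
print(len(vector))
```

With this scheme a sequence of length L and a window of size w yields (L - w + 1) * 20 values, so all sequences in a data set need the same length to produce feature vectors of equal size.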
