We provide a Perl script that accesses the web services and parses the output data. The script takes as input a file containing all query proteins in FASTA format, and it can query many proteins per run. A sample input and a sample output are also included in this tarball.
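To illustrate the FASTA input format mentioned above, the following is a minimal sketch in Python of how such a file could be read before submitting queries. It is not the provided Perl script and does not call the web services. The function names `read_fasta` and `lysine_positions` and the example record are hypothetical, and listing lysine residues as candidate acetylation sites is an illustrative assumption.

```python
from typing import Dict, Iterable, List


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {header: sequence} mapping.

    Hypothetical helper for illustration; headers are kept without
    the leading '>' and multi-line sequences are concatenated.
    """
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


def lysine_positions(sequence: str) -> List[int]:
    """Return 1-based positions of lysine (K) residues in a sequence."""
    return [i for i, aa in enumerate(sequence, start=1) if aa.upper() == "K"]


if __name__ == "__main__":
    # Made-up example record, not taken from the sample input file.
    example = [">example_protein", "MKTAY", "IAKQR"]
    for name, seq in read_fasta(example).items():
        print(name, len(seq), lysine_positions(seq))
```

To read an actual input file, pass an open file handle instead of the list, for example `read_fasta(open("queries.fasta"))`, where the file name is a placeholder.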
We collected human proteins acetylated by the CBP/p300, GCN5/PCAF, and MYST families and human proteins deacetylated by Class I HDAC and SIRT1 by searching the PubMed literature with keywords. Papers and their related references were examined, and the papers that reported identified acetylation sites together with KAT information were selected. The acetylated proteins were extracted and mapped to the UniProt database to retrieve their Swiss-Prot accession numbers. The acetylation sites were reviewed carefully to ensure that each recorded position was the exact position mentioned in the literature.
The ASEB R package (developer version) is available from Bioconductor (>= 2.10): ASEB@Bioconductor. Bioconductor is an open-source, open-development software project that provides tools for the analysis and comprehension of high-throughput genomic data. It is based primarily on the R programming language.
The Python package is available here. A local version is provided so that researchers can run large numbers of predictions offline with the optimized method, which can also be applied to the prediction of other post-translational modifications.