Abydos
======
+------------------+------------------------------------------------------+
| CI & Test Status | |travis| |circle| |azure| |semaphore| |coveralls|    |
+------------------+------------------------------------------------------+
| Code Quality     | |codeclimate| |scrutinizer| |codacy| |codefactor|    |
+------------------+------------------------------------------------------+
| Dependencies     | |requires| |snyk| |pyup| |cii| |black|               |
+------------------+------------------------------------------------------+
| Local Analysis   | |pylint| |flake8| |pydocstyle| |sloccount| |mypy|    |
+------------------+------------------------------------------------------+
| Usage            | |docs| |mybinder| |license| |sourcerank| |zenodo|    |
+------------------+------------------------------------------------------+
| Contribution     | |openhub| |gh-commits| |gh-issues| |gh-stars|        |
+------------------+------------------------------------------------------+
| PyPI             | |pypi| |pypi-dl| |pypi-ver|                          |
+------------------+------------------------------------------------------+
| conda-forge      | |conda| |conda-dl| |conda-platforms|                 |
+------------------+------------------------------------------------------+

.. |travis| image:: https://travis-ci.org/chrislit/abydos.svg?branch=master
    :target: https://travis-ci.org/chrislit/abydos
    :alt: Travis-CI Build Status
.. |circle| image:: https://circleci.com/gh/chrislit/abydos/tree/master.svg?style=shield
    :target: https://circleci.com/gh/chrislit/abydos/tree/master
    :alt: Circle-CI Build Status
.. |azure| image:: https://dev.azure.com/chrislit/abydos/_apis/build/status/chrislit.abydos?branchName=master
    :target: https://dev.azure.com/chrislit/abydos/_build/latest?definitionId=1
    :alt: Azure Pipelines Build Status
.. |semaphore| image:: https://semaphoreci.com/api/v1/chrislit/abydos/branches/master/shields_badge.svg
    :target: https://semaphoreci.com/chrislit/abydos
    :alt: Semaphore Build Status
.. |coveralls| image:: https://coveralls.io/repos/github/chrislit/abydos/badge.svg?branch=master
    :target: https://coveralls.io/github/chrislit/abydos?branch=master
    :alt: Coverage Status
.. |codeclimate| image:: https://codeclimate.com/github/chrislit/abydos/badges/gpa.svg
    :target: https://codeclimate.com/github/chrislit/abydos
    :alt: Code Climate
.. |scrutinizer| image:: https://scrutinizer-ci.com/g/chrislit/abydos/badges/quality-score.png?b=master
    :target: https://scrutinizer-ci.com/g/chrislit/abydos/?branch=master
    :alt: Scrutinizer
.. |codacy| image:: https://api.codacy.com/project/badge/Grade/db79f2c31ea142fb9b5938abe87b0854
    :target: https://www.codacy.com/app/chrislit/abydos?utm_source=github.com&utm_medium=referral&utm_content=chrislit/abydos&utm_campaign=Badge_Grade
    :alt: Codacy
.. |codefactor| image:: https://www.codefactor.io/repository/github/chrislit/abydos/badge
    :target: https://www.codefactor.io/repository/github/chrislit/abydos
    :alt: CodeFactor
.. |requires| image:: https://requires.io/github/chrislit/abydos/requirements.svg?branch=master
    :target: https://requires.io/github/chrislit/abydos/requirements/?branch=master
    :alt: Requirements Status
.. |snyk| image:: https://snyk.io/test/github/chrislit/abydos/badge.svg?targetFile=requirements.txt
    :target: https://snyk.io/test/github/chrislit/abydos?targetFile=requirements.txt
    :alt: Known Vulnerabilities
.. |pyup| image:: https://pyup.io/repos/github/chrislit/abydos/shield.svg
    :target: https://pyup.io/repos/github/chrislit/abydos/
    :alt: Updates
.. |cii| image:: https://bestpractices.coreinfrastructure.org/projects/1598/badge
    :target: https://bestpractices.coreinfrastructure.org/projects/1598
    :alt: CII Best Practices
.. |black| image:: https://img.shields.io/badge/code%20style-black-000000.svg
    :target: https://github.com/ambv/black
    :alt: black
.. |pylint| image:: https://img.shields.io/badge/Pylint-9.13/10-yellowgreen.svg
    :target: #
    :alt: Pylint Score
.. |flake8| image:: https://img.shields.io/badge/flake8-0-brightgreen.svg
    :target: #
    :alt: flake8 Errors
.. |pydocstyle| image:: https://img.shields.io/badge/pydocstyle-0-brightgreen.svg
    :target: #
    :alt: pydocstyle Errors
.. |sloccount| image:: https://img.shields.io/badge/SLOCCount-40,079-blue.svg
    :target: #
    :alt: SLOCCount
.. |mypy| image:: https://img.shields.io/badge/mypy-1.87%25%20imprecise-1F5082.svg
    :target: #
    :alt: mypy Imprecision
.. |docs| image:: https://readthedocs.org/projects/abydos/badge/?version=latest
    :target: https://abydos.readthedocs.org/en/latest/
    :alt: Documentation Status
.. |mybinder| image:: https://img.shields.io/badge/launch-binder-579aca.svg?logo=data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAFkAAABZCAMAAABi1XidAAAB8lBMVEX///9XmsrmZYH1olJXmsr1olJXmsrmZYH1olJXmsr1olJXmsrmZYH1olL1olJXmsr1olJXmsrmZYH1olL1olJXmsrmZYH1olJXmsr1olL1olJXmsrmZYH1olL1olJXmsrmZYH1olL1olL0nFf1olJXmsrmZYH1olJXmsq8dZb1olJXmsrmZYH1olJXmspXmspXmsr1olL1olJXmsrmZYH1olJXmsr1olL1olJXmsrmZYH1olL1olLeaIVXmsrmZYH1olL1olL1olJXmsrmZYH1olLna31Xmsr1olJXmsr1olJXmsrmZYH1olLqoVr1olJXmsr1olJXmsrmZYH1olL1olKkfaPobXvviGabgadXmsqThKuofKHmZ4Dobnr1olJXmsr1olJXmspXmsr1olJXmsrfZ4TuhWn1olL1olJXmsqBi7X1olJXmspZmslbmMhbmsdemsVfl8ZgmsNim8Jpk8F0m7R4m7F5nLB6jbh7jbiDirOEibOGnKaMhq+PnaCVg6qWg6qegKaff6WhnpKofKGtnomxeZy3noG6dZi+n3vCcpPDcpPGn3bLb4/Mb47UbIrVa4rYoGjdaIbeaIXhoWHmZYHobXvpcHjqdHXreHLroVrsfG/uhGnuh2bwj2Hxk17yl1vzmljzm1j0nlX1olL3AJXWAAAAbXRSTlMAEBAQHx8gICAuLjAwMDw9PUBAQEpQUFBXV1hgYGBkcHBwcXl8gICAgoiIkJCQlJicnJ2goKCmqK+wsLC4usDAwMjP0NDQ1NbW3Nzg4ODi5+3v8PDw8/T09PX29vb39/f5+fr7+/z8/Pz9/v7+zczCxgAABC5JREFUeAHN1ul3k0UUBvCb1CTVpmpaitAGSLSpSuKCLWpbTKNJFGlcSMAFF63iUmRccNG6gLbuxkXU66JAUef/9LSpmXnyLr3T5AO/rzl5zj137p136BISy44fKJXuGN/d19PUfYeO67Znqtf2KH33Id1psXoFdW30sPZ1sMvs2D060AHqws4FHeJojLZqnw53cmfvg+XR8mC0OEjuxrXEkX5ydeVJLVIlV0e10PXk5k7dYeHu7Cj1j+49uKg7uLU61tGLw1lq27ugQYlclHC4bgv7VQ+TAyj5Zc/UjsPvs1sd5cWryWObtvWT2EPa4rtnWW3JkpjggEpbOsPr7F7EyNewtpBIslA7p43HCsnwooXTEc3UmPmCNn5lrqTJxy6nRmcavGZVt/3Da2pD5NHvsOHJCrdc1G2r3DITpU7yic7w/7Rxnjc0kt5GC4djiv2Sz3Fb2iEZg41/ddsFDoyuYrIkmFehz0HR2thPgQqMyQYb2OtB0WxsZ3BeG3+wpRb1vzl2UYBog8FfGhttFKjtAclnZYrRo9ryG9uG/FZQU4AEg8ZE9LjGMzTmqKXPLnlWVnIlQQTvxJf8ip7VgjZjyVPrjw1te5otM7RmP7xm+sK2Gv9I8Gi++BRbEkR9EBw8zRUcKxwp73xkaLiqQb+kGduJTNHG72zcW9LoJgqQxpP3/Tj//c3yB0tqzaml05/+orHLksVO+95kX7/7qgJvnjlrfr2Ggsyx0eoy9uPzN5SPd86aXggOsEKW2Prz7du3VID3/tzs/sSRs2w7ovVHKtjrX2pd7ZMlTxAYfBAL9jiDwfLkq55Tm7ifhMlTGPyCAs7RFRhn47JnlcB9RM5T97ASuZXIcVNuUDIndpDbdsfrqsOppeXl5Y+XVKdjFCTh+zGaVuj0d9zy05PPK3QzBamxdwtTCrzyg/2Rvf2EstUjordGwa/kx9mSJLr8mLLtCW8HHGJc2R5hS219IiF6PnTusOqcMl57gm0Z8kanKMAQg
0qSyuZfn7zItsbGyO9QlnxY0eCuD1XL2ys/MsrQhltE7Ug0uFOzufJFE2PxBo/YAx8XPPdDwWN0MrDRYIZF0mSMKCNHgaIVFoBbNoLJ7tEQDKxGF0kcLQimojCZopv0OkNOyWCCg9XMVAi7ARJzQdM2QUh0gmBozjc3Skg6dSBRqDGYSUOu66Zg+I2fNZs/M3/f/Grl/XnyF1Gw3VKCez0PN5IUfFLqvgUN4C0qNqYs5YhPL+aVZYDE4IpUk57oSFnJm4FyCqqOE0jhY2SMyLFoo56zyo6becOS5UVDdj7Vih0zp+tcMhwRpBeLyqtIjlJKAIZSbI8SGSF3k0pA3mR5tHuwPFoa7N7reoq2bqCsAk1HqCu5uvI1n6JuRXI+S1Mco54YmYTwcn6Aeic+kssXi8XpXC4V3t7/ADuTNKaQJdScAAAAAElFTkSuQmCC
    :target: https://mybinder.org/v2/gh/chrislit/abydos/master?filepath=binder
    :alt: Binder
.. |license| image:: https://img.shields.io/badge/License-GPL%20v3+-blue.svg?logo=gnu
    :target: https://www.gnu.org/licenses/gpl-3.0
    :alt: License: GPL v3.0+
.. |sourcerank| image:: https://img.shields.io/librariesio/sourcerank/pypi/abydos.svg
    :target: https://libraries.io/pypi/abydos
    :alt: Libraries.io SourceRank
.. |zenodo| image:: https://zenodo.org/badge/DOI/10.5281/zenodo.3603514.svg
    :target: https://doi.org/10.5281/zenodo.3603514
    :alt: Zenodo
.. |openhub| image:: https://www.openhub.net/p/abydosnlp/widgets/project_thin_badge.gif
    :target: https://www.openhub.net/p/abydosnlp
    :alt: OpenHUB
.. |gh-commits| image:: https://img.shields.io/github/commit-activity/y/chrislit/abydos.svg?logo=github
    :target: https://github.com/chrislit/abydos/graphs/commit-activity
    :alt: GitHub Commits
.. |gh-issues| image:: https://img.shields.io/github/issues-closed/chrislit/abydos.svg?logo=github
    :target: https://github.com/chrislit/abydos/issues?q=
    :alt: GitHub Issues Closed
.. |gh-stars| image:: https://img.shields.io/github/stars/chrislit/abydos.svg?logo=github
    :target: https://github.com/chrislit/abydos/stargazers
    :alt: GitHub Stars
.. |pypi| image:: https://img.shields.io/pypi/v/abydos.svg?logo=python&logoColor=white
    :target: https://pypi.python.org/pypi/abydos
    :alt: PyPI
.. |pypi-dl| image:: https://img.shields.io/pypi/dm/abydos.svg?logo=python&logoColor=white
    :target: https://pypi.python.org/pypi/abydos
    :alt: PyPI downloads/month
.. |pypi-ver| image:: https://img.shields.io/pypi/pyversions/abydos.svg?logo=python&logoColor=white
    :target: https://pypi.python.org/pypi/abydos
    :alt: PyPI versions
.. |conda| image:: https://img.shields.io/conda/vn/conda-forge/abydos.svg?logo=conda-forge
    :target: https://anaconda.org/conda-forge/abydos
    :alt: conda-forge
.. |conda-dl| image:: https://img.shields.io/conda/dn/conda-forge/abydos.svg?logo=conda-forge
    :target: https://anaconda.org/conda-forge/abydos
    :alt: conda-forge downloads
.. |conda-platforms| image:: https://img.shields.io/conda/pn/conda-forge/abydos.svg?logo=conda-forge
    :target: https://anaconda.org/conda-forge/abydos
    :alt: conda-forge platforms

|

.. image:: https://raw.githubusercontent.com/chrislit/abydos/master/abydos-small.png
    :target: https://github.com/chrislit/abydos
    :alt: abydos
    :align: right

|
| `Abydos NLP/IR library <https://github.com/chrislit/abydos>`_
| Copyright 2014-2020 by Christopher C. Little

Abydos is a library of phonetic algorithms, string distance measures & metrics,
stemmers, and string fingerprinters including:

- Phonetic algorithms

  - Robert C. Russell's Index
  - American Soundex
  - Refined Soundex
  - Daitch-Mokotoff Soundex
  - Kölner Phonetik
  - NYSIIS
  - Match Rating Algorithm
  - Metaphone
  - Double Metaphone
  - Caverphone
  - Alpha Search Inquiry System
  - Fuzzy Soundex
  - Phonex
  - Phonem
  - Phonix
  - SfinxBis
  - phonet
  - Standardized Phonetic Frequency Code
  - Statistics Canada
  - Lein
  - Roger Root
  - Oxford Name Compression Algorithm (ONCA)
  - Eudex phonetic hash
  - Haase Phonetik
  - Reth-Schek Phonetik
  - FONEM
  - Parmar-Kumbharana
  - Davidson's Consonant Code
  - SoundD
  - PSHP Soundex/Viewex Coding
  - an early version of Henry Code
  - Norphone
  - Dolby Code
  - Phonetic Spanish
  - Spanish Metaphone
  - MetaSoundex
  - SoundexBR
  - NRL English-to-phoneme
  - Beider-Morse Phonetic Matching

- String distance metrics

  - Levenshtein distance
  - Optimal String Alignment distance
  - Damerau-Levenshtein distance
  - Hamming distance
  - Tversky index
  - Sørensen–Dice coefficient & distance
  - Jaccard similarity coefficient & distance
  - overlap similarity & distance
  - Tanimoto coefficient & distance
  - Minkowski distance & similarity
  - Manhattan distance & similarity
  - Euclidean distance & similarity
  - Chebyshev distance
  - cosine similarity & distance
  - Jaro distance
  - Jaro-Winkler distance (incl. the strcmp95 algorithm variant)
  - Longest common substring
  - Ratcliff-Obershelp similarity & distance
  - Match Rating Algorithm similarity
  - Normalized Compression Distance (NCD) & similarity
  - Monge-Elkan similarity & distance
  - Matrix similarity
  - Needleman-Wunsch score
  - Smith-Waterman score
  - Gotoh score
  - Length similarity
  - Prefix, Suffix, and Identity similarity & distance
  - Modified Language-Independent Product Name Search (MLIPNS) similarity &
    distance
  - Bag distance
  - Editex distance
  - Eudex distances
  - Sift4 distance
  - Baystat distance & similarity
  - Typo distance
  - Indel distance
  - Synoname

- Stemmers

  - the Lovins stemmer
  - the Porter and Porter2 (Snowball English) stemmers
  - Snowball stemmers for German, Dutch, Norwegian, Swedish, and Danish
  - CLEF German, German plus, and Swedish stemmers
  - Caumanns' German stemmer
  - UEA-Lite Stemmer
  - Paice-Husk Stemmer
  - Schinke Latin stemmer
  - S stemmer

- String Fingerprints

  - string fingerprint
  - q-gram fingerprint
  - phonetic fingerprint
  - Pollock & Zamora's skeleton key
  - Pollock & Zamora's omission key
  - Cisłak & Grabowski's occurrence fingerprint
  - Cisłak & Grabowski's occurrence halved fingerprint
  - Cisłak & Grabowski's count fingerprint
  - Cisłak & Grabowski's position fingerprint
  - Synoname Toolcode

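
To give a feel for two of the simpler techniques listed above, here is a
minimal, self-contained sketch of the Levenshtein distance and of a string
fingerprint. It is illustrative only and is not Abydos's own code or API: the
function names ``levenshtein`` and ``string_fingerprint`` are hypothetical,
unit edit costs are assumed, and the fingerprint assumes the common "unique,
sorted, lowercased tokens" definition.

.. code-block:: python

    import unicodedata


    def levenshtein(src: str, tar: str) -> int:
        """Return the unit-cost edit distance between src and tar.

        Hypothetical helper for illustration; not part of Abydos.
        """
        # previous[j] holds the distance between the processed prefix of
        # src and the first j characters of tar.
        previous = list(range(len(tar) + 1))
        for i, s_char in enumerate(src, start=1):
            current = [i]
            for j, t_char in enumerate(tar, start=1):
                current.append(
                    min(
                        previous[j] + 1,  # delete s_char
                        current[j - 1] + 1,  # insert t_char
                        previous[j - 1] + (s_char != t_char),  # substitute
                    )
                )
            previous = current
        return previous[-1]


    def string_fingerprint(phrase: str) -> str:
        """Return the unique tokens of phrase, lowercased and sorted.

        Hypothetical helper for illustration; not part of Abydos.
        """
        # Decompose accented characters so combining marks can be dropped.
        normalized = unicodedata.normalize("NFKD", phrase.lower())
        # Keep only letters, digits, and whitespace.
        kept = "".join(c for c in normalized if c.isalnum() or c.isspace())
        return " ".join(sorted(set(kept.split())))

With these definitions, ``levenshtein("kitten", "sitting")`` evaluates to
``3``, and ``string_fingerprint("The quick brown fox, the QUICK fox!")``
evaluates to ``"brown fox quick the"``.
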
-----

Installation
============

Required libraries:

- NumPy
- deprecation

Optional libraries (all available on PyPI, some available on conda or
conda-forge):

- `SyllabiPy <http://syllabipy.com/>`_
- `NLTK <https://www.nltk.org/>`_
- `PyLZSS <https://github.com/rumbah/pylzss>`_
- `paq <https://github.com/observerss/paq>`_

To install Abydos (master) from GitHub source::

    git clone https://github.com/chrislit/abydos.git --recursive
    cd abydos
    python setup.py install

If your default python command calls Python 2.7 but you want to install for
Python 3, you may instead need to call::

    python3 setup.py install

To install Abydos (latest release) from PyPI using pip::

    pip install abydos

To install from `conda-forge <https://anaconda.org/conda-forge/abydos>`_::

    conda install -c conda-forge abydos

It should run on Python 3.5-3.8.

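
After installing, a quick sanity check along the following lines may be
useful. This is only a sketch: the helper names ``python_supported`` and
``is_installed`` are hypothetical, and the version bounds simply mirror the
3.5-3.8 range stated above rather than anything enforced by the package.

.. code-block:: python

    import importlib.util
    import sys


    def python_supported(version=sys.version_info, low=(3, 5), high=(3, 8)):
        """Return True if the (major, minor) version lies in [low, high].

        The default bounds are an assumption taken from the README text.
        """
        return low <= tuple(version[:2]) <= high


    def is_installed(package):
        """Return True if a top-level package can be located for import."""
        return importlib.util.find_spec(package) is not None


    if __name__ == "__main__":
        print("Python version in stated range:", python_supported())
        print("abydos importable:", is_installed("abydos"))

A ``False`` for the version check does not necessarily mean that Abydos will
fail to run; it only means the interpreter is outside the range listed here.
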
Testing & Contributing
======================

To run the whole test suite, just call tox::

    tox

The tox setup has the following environments: black, py37, doctest,
regression, fuzz, pylint, pydocstyle, flake8, doc8, docs, sloccount, badges, &
build. So if you only want to generate documentation (in HTML, EPUB, & PDF
formats), just call::

    tox -e docs

To run only Flake8 and generate its reports, call::

    tox -e flake8

Contributions such as bug reports, PRs, suggestions, desired new features, etc.
are welcome through GitHub
`Issues <https://github.com/chrislit/abydos/issues>`_ &
`Pull requests <https://github.com/chrislit/abydos/pulls>`_.