Our project aims to develop an API that takes a text and generates a summary of it. The tools required for the project are built from scratch to provide flexibility and customizability. Summarization is extractive and based on TF-IDF: the summary is generated by identifying the important sections of the original text and extracting them. With this API, one can summarize large chunks of text, emails, or even a text file.

Getting started:

  1. Install the requirements. On Windows use cmd; on Ubuntu use the terminal. Run the following commands one by one to install the numpy, pandas and nltk libraries.

    pip install numpy

    pip install pandas

    pip install nltk

  2. Clone the GitHub repository using the command below in cmd or the terminal. /saramsh_package is the package directory.

    git clone

  3. Copy the /saramsh_package directory into your project directory.

                        your project directory/
                        ├── saramsh_package/
                        │   ├──
                        │   ├── 

  4. Import Saramsh in your application using the line below.

    >>> from saramsh_package.saramsh import Saramsh

  5. Create an object of the Saramsh class using the line below, passing the data and title as strings.

    >>> sm = Saramsh(data , title)

  6. Using the object created, call the summarize() method, which prints the threshold value followed by the title and summary.

    >>> sm.summarize()

  7. To understand how the summarize() method works, open this link [click here]
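
For a rough idea of the general technique, here is a minimal sketch of extractive summarization with TF-IDF and a score threshold. It is not the code inside saramsh_package: the function names (split_sentences, tokenize, score_sentences, summarize_sketch) and the threshold_factor parameter are hypothetical, sentences are split with a simple regular expression instead of nltk, and the threshold is assumed to be the average sentence score.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercase words made of letters, digits and apostrophes.
    return re.findall(r"[a-z0-9']+", sentence.lower())


def score_sentences(sentences):
    # Treat each sentence as a "document" for the purposes of TF-IDF.
    tokenized = [tokenize(s) for s in sentences]
    n = len(tokenized)
    # Document frequency: number of sentences that contain each word.
    df = Counter(word for tokens in tokenized for word in set(tokens))
    scores = []
    for tokens in tokenized:
        if not tokens:
            scores.append(0.0)
            continue
        tf = Counter(tokens)
        total = sum(
            (count / len(tokens)) * math.log(n / df[word])
            for word, count in tf.items()
        )
        # Average over distinct words so long sentences are not favored.
        scores.append(total / len(tf))
    return scores


def summarize_sketch(text, threshold_factor=1.0):
    # Keep, in original order, the sentences scoring at or above the threshold.
    sentences = split_sentences(text)
    if not sentences:
        return ""
    scores = score_sentences(sentences)
    threshold = threshold_factor * sum(scores) / len(scores)
    kept = [s for s, score in zip(sentences, scores) if score >= threshold]
    return " ".join(kept)
```

Calling summarize_sketch(text) returns the selected sentences joined into one string. Raising threshold_factor above 1.0 makes the summary shorter, and lowering it keeps more of the original text.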