The current summarize() module uses TF_IDF scoring mechanism to summarize the given raw text data.

saramsh.summarize(data, title) [source]



Parameters :
  • data : raw text data

    All the pre-processing is taken care by the summarize() itself.

  • title : raw title of the provided data

    Title is currently not used in generating summary.

Attributes :
  • tf_ : term frequency [ndArray]

    Matrix containing how many times each word repeated in a sentence.And the values are normalized by dividing with length of unique words in each sentence.

  • idf_ : inverse document frequency [ndArray]

    Single row matrix containing idf scores.

  • tf_idf_ : TF*IDF matrix [ndArray]

    Matrix obtained by multiplying tf into idf matrices.

  • sentenceScores_ : scores of each sentence [list]

    Scores of each sentence in the given data.

  • summary : generated summary [string]

    The summary generated using TF-IDF scores.



Notes :
  • TF : Term frequency

    Term_frequency = (Number of times a term 't' occurs in a sentence) / (Total number of unique words in that sentence)

  • IDF : Inverse document frequency

    Inverse_document_frequency = log((Total number of documents)+1 / (Number of documents in which term 't' appears)+1)

  • There are other schemes for both TF and IDF, which can be understood from here : [click here]


Methods :

Here is a class diagram, containing all the methods within the saramsh package.

methods


A typical workflow :

Below is an workflow of how the module generates summary using the given data.

View
methods


Sample code :

Below is simple starter code :


Data was taken from this article : [click here for article]
This is an inshorts summary of above article : [click here fro inshort summary]

>>> from saramsh_package.saramsh import Saramsh
>>> sm = Saramsh(data , title)
>>> sm.summarize()
0.9810933498889896

Divya Dutta: ‘I once lost a role because I was told I am fair’

She wants to do your role.Earlier, we used to say chhota role nahi karenge.I never said that, but I have seen girls say it.Now they say chhota hai, koi baat nahi, impactful hai na?Divya herself had to colour her skin for Delhi 6.They were looking for a village woman who is darker.They told me you suit the role completely.I asked then why am I not doing it?They replied you are too fair for the role.But I am an actor.I was darkened for Delhi 6.