summarize()

Parameters :

data : raw text data
All pre-processing is handled by summarize() itself.
title : raw title of the provided data
The title is currently not used when generating the summary.

Attributes :

tf_ : term frequency [ndArray]
Matrix containing the number of times each word occurs in each sentence. The values are normalized by dividing by the number of unique words in that sentence.
idf_ : inverse document frequency [ndArray]
Single row matrix containing idf scores.
tf_idf_ : TF*IDF matrix [ndArray]
Matrix obtained by multiplying the tf matrix by the idf matrix.
sentenceScores_ : scores of each sentence [list]
Scores of each sentence in the given data.
summary : generated summary [string]
The summary generated using TF-IDF scores.
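
As a rough illustration of how the attributes above could relate to one another, the sketch below multiplies a small term-frequency matrix by a single-row idf matrix and derives one score per sentence. The numbers are made up, and two points are assumptions rather than documented behaviour: that the product is element-wise, and that a sentence score is the sum of its TF-IDF weights. With NumPy arrays of these shapes, the same product could be written as tf * idf through broadcasting.

```python
# Hypothetical values: 3 sentences (rows) x 4 vocabulary terms (columns).
tf = [
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.5, 1.0],
]
# Single-row matrix: one idf score per vocabulary term.
idf = [[0.7, 0.0, 0.3, 0.7]]

# Assumption: element-wise product, reusing the idf row for every sentence.
tf_idf = [[t * w for t, w in zip(row, idf[0])] for row in tf]

# Assumption: a sentence score is the sum of that sentence's TF-IDF weights.
sentence_scores = [sum(row) for row in tf_idf]

# Index of the highest-scoring sentence.
best = max(range(len(sentence_scores)), key=sentence_scores.__getitem__)
```

A term whose idf is 0 (here the second column) contributes nothing to any sentence score, however often it occurs.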

Notes :

TF : Term frequency
Term_frequency = (Number of times a term 't' occurs in a sentence) / (Total number of unique words in that sentence)
IDF : Inverse document frequency
Inverse_document_frequency = log((Total number of documents + 1) / (Number of documents in which term 't' appears + 1))
Other weighting schemes exist for both TF and IDF; they are described here : [click here]
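
The two formulas in these notes can be worked through with a minimal, standard-library sketch. It is not the package's implementation: the function name tf_idf_terms is hypothetical, input is assumed to be already tokenized, every sentence is assumed to be non-empty, and each sentence is treated as one "document" for the IDF count.

```python
import math

def tf_idf_terms(sentences):
    """Compute TF and IDF for a list of tokenized sentences.

    Assumption: each sentence counts as one "document" and is non-empty.
    """
    vocab = sorted({word for sentence in sentences for word in sentence})
    n_docs = len(sentences)

    # TF: occurrences of the term / number of unique words in the sentence.
    tf = []
    for sentence in sentences:
        unique_words = len(set(sentence))
        tf.append([sentence.count(word) / unique_words for word in vocab])

    # IDF: log((documents + 1) / (documents containing the term + 1)).
    idf = []
    for word in vocab:
        doc_freq = sum(1 for sentence in sentences if word in sentence)
        idf.append(math.log((n_docs + 1) / (doc_freq + 1)))

    return vocab, tf, idf

sentences = [["cat", "sat", "cat"], ["dog", "sat"]]
vocab, tf, idf = tf_idf_terms(sentences)
```

In this toy input, "sat" appears in every sentence, so its IDF is log(3/3) = 0, while "cat" and "dog" each appear in one sentence and get log(3/2).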

Sample code :

Below is a simple starter example :

Data was taken from this article : [click here for article]
This is the Inshorts summary of the above article : [click here for Inshorts summary]


                >>> from saramsh_package.saramsh import Saramsh

                >>> # data and title are strings holding the raw article text and its title
                >>> sm = Saramsh(data, title)

                >>> sm.summarize()

                0.9810933498889896


                Divya Dutta: ‘I once lost a role because I was told I am fair’ 


                She wants to do your role.Earlier, we used to say chhota role nahi karenge.I never said that, but I have seen girls say it.Now they say chhota hai, koi baat nahi, impactful hai na?Divya herself had to colour her skin for Delhi 6.They were looking for a village woman who is darker.They told me you suit the role completely.I asked then why am I not doing it?They replied you are too fair for the role.But I am an actor.I was darkened for Delhi 6.

saramsh.summarize(data, title) [source]
