PyPI - noshot - Versions diffs - 0.1.7__py3-none-any.whl → 0.1.8__py3-none-any.whl - Mend

noshot 0.1.7__py3-none-any.whl → 0.1.8__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (228)

noshot/data/ML TS XAI/NLP/NLP 5/sample1.txt DELETED

@@ -1,15 +0,0 @@
-### The Impact of Social Media on Modern Communication
-In the digital age, social media has revolutionized the way people communicate, offering unprecedented access to information and creating new ways to interact. Platforms like Facebook, Twitter, Instagram, and TikTok have connected individuals from across the globe, allowing for the instant exchange of ideas, images, and experiences. However, the rise of social media has also raised significant concerns about its impact on human relationships, mental health, and societal dynamics. This essay explores the positive and negative effects of social media on modern communication.
-On the positive side, social media has made communication more convenient and accessible than ever before. In the past, staying in touch with friends and family required physical mail, phone calls, or face-to-face interactions. Now, platforms like Facebook and WhatsApp allow people to send messages, share updates, and make video calls at any time, from anywhere in the world. This has facilitated long-distance relationships, strengthened bonds among friends and family, and made it easier to stay connected with people who may otherwise be difficult to reach.
-Moreover, social media has democratized communication, allowing individuals to express their opinions and ideas to a global audience. This has had a profound effect on activism and social movements. For example, platforms like Twitter and Instagram have played crucial roles in raising awareness about issues such as climate change, racial injustice, and political corruption. Activists can mobilize support, organize protests, and share important information in real time. The viral nature of social media also means that messages can reach millions of people in a matter of hours, making it an invaluable tool for social change.
-However, social media's influence is not entirely positive. One of the primary concerns is the effect it has on face-to-face communication skills. As people spend more time interacting online, they may become less adept at having meaningful in-person conversations. Social media interactions tend to be more superficial, with users often relying on emojis, likes, or short messages rather than engaging in deep, thoughtful discussions. This can result in a decline in the quality of personal relationships, as online communication often lacks the nuances and emotional depth found in face-to-face conversations.
-Another issue is the impact of social media on mental health. Studies have shown that excessive use of platforms like Instagram and Facebook can lead to feelings of isolation, anxiety, and depression. Constant comparison to others, especially when viewing curated, idealized images of other people's lives, can lead to low self-esteem and body image issues. The pressure to present a perfect life online, coupled with the fear of missing out (FOMO), can also contribute to heightened stress and dissatisfaction. Additionally, cyberbullying and online harassment have become increasingly prevalent, leading to harmful consequences for individuals, particularly teenagers.
-Furthermore, social media can exacerbate the spread of misinformation. Fake news, conspiracy theories, and misleading content can spread rapidly across platforms, influencing public opinion and distorting perceptions of reality. The algorithms that govern social media platforms often prioritize content that generates engagement, meaning sensational or controversial material is more likely to be shared and seen by a wide audience. This can create echo chambers where individuals are exposed only to information that confirms their existing beliefs, reinforcing polarization and division in society.
-In conclusion, social media has undeniably transformed modern communication, making it easier to connect with others and share information on a global scale. However, its impact on face-to-face interactions, mental health, and the spread of misinformation presents significant challenges. As social media continues to evolve, it is crucial that users and society as a whole strike a balance, using these platforms in ways that enhance communication while minimizing their negative effects.

noshot/data/ML TS XAI/NLP/NLP 5/sample2.txt DELETED

@@ -1,4 +0,0 @@
-chair put coat, back Furniture
-chair IT department Furniture
-where here put chair Furniture
-CSE chair head Position

noshot/data/ML TS XAI/NLP/NLP 5/word2vec_model.bin DELETED

Binary file

noshot/data/ML TS XAI/NLP/NLP 6/1. Tokenize, Tagging, NER, Parse Tree.ipynb DELETED

@@ -1,312 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "id": "1a144d85-fe3b-4392-854b-f194b5583f23",
-   "metadata": {},
-   "source": [
-    "# Experiment 1 :\n",
-    "<b>Perform the following task using NLTK: Tokenize and tag some text, identify named entities, display a parse tree and find the ambiguity of the sentence using parse tree</b>"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 1,
-   "id": "46410704-680c-4baa-b3d4-33645fa19ddb",
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "--------------------------------------------------\n",
-      "TOKENIZED SENTENCES\n",
-      "--------------------------------------------------\n",
-      "1. [Emma by Jane Austen 1816]  VOLUME I  CHAPTER I   Emma Woodhouse, handsome, clever, and rich, with a comfortable home and happy disposition, seemed to unite some of the best blessings of existence; and had lived nearly twenty-one years in the world with very little to distress or vex her.\n",
-      "2. She was the youngest of the two daughters of a most affectionate, indulgent father; and had, in consequence of her sister's marriage, been mistress of his house from a very early period.\n",
-      "3. Her mother had died too long ago for her to have more than an indistinct remembrance of her caresses; and her place had been supplied by an excellent woman as governess, who had fallen little short of a mother in affection.\n",
-      "4. Sixteen years had Miss Taylor been in Mr. Woodhouse's family, less as a governess than a friend, very fond of both daughters, but particularly of Emma.\n",
-      "5. Between _them_ it was more the intimacy of sisters.\n",
-      "6. Even before Miss Taylor had ceased to hold the nominal office of governess, the mildness\n",
-      "\n",
-      "--------------------------------------------------\n",
-      "TOKENIZED WORDS\n",
-      "--------------------------------------------------\n",
-      "['Emma', 'by', 'Jane', 'Austen', '1816', 'VOLUME', 'I', 'CHAPTER', 'I', 'Emma', 'Woodhouse', 'handsome', 'clever', 'and', 'rich', 'with', 'a', 'comfortable', 'home', 'and', 'happy', 'disposition', 'seemed', 'to', 'unite', 'some', 'of', 'the', 'best', 'blessings', 'of', 'existence', 'and', 'had', 'lived', 'nearly', 'twentyone', 'years', 'in', 'the', 'world', 'with', 'very', 'little', 'to', 'distress', 'or', 'vex', 'her']\n",
-      "['She', 'was', 'the', 'youngest', 'of', 'the', 'two', 'daughters', 'of', 'a', 'most', 'affectionate', 'indulgent', 'father', 'and', 'had', 'in', 'consequence', 'of', 'her', 'sisters', 'marriage', 'been', 'mistress', 'of', 'his', 'house', 'from', 'a', 'very', 'early', 'period']\n",
-      "['Her', 'mother', 'had', 'died', 'too', 'long', 'ago', 'for', 'her', 'to', 'have', 'more', 'than', 'an', 'indistinct', 'remembrance', 'of', 'her', 'caresses', 'and', 'her', 'place', 'had', 'been', 'supplied', 'by', 'an', 'excellent', 'woman', 'as', 'governess', 'who', 'had', 'fallen', 'little', 'short', 'of', 'a', 'mother', 'in', 'affection']\n",
-      "['Sixteen', 'years', 'had', 'Miss', 'Taylor', 'been', 'in', 'Mr', 'Woodhouses', 'family', 'less', 'as', 'a', 'governess', 'than', 'a', 'friend', 'very', 'fond', 'of', 'both', 'daughters', 'but', 'particularly', 'of', 'Emma']\n",
-      "['Between', 'them', 'it', 'was', 'more', 'the', 'intimacy', 'of', 'sisters']\n",
-      "['Even', 'before', 'Miss', 'Taylor', 'had', 'ceased', 'to', 'hold', 'the', 'nominal', 'office', 'of', 'governess', 'the', 'mildness']\n",
-      "\n",
-      "--------------------------------------------------\n",
-      "SENTENCES WITHOUT STOPWORDS\n",
-      "--------------------------------------------------\n",
-      "1. Emma Jane Austen 1816 VOLUME CHAPTER Emma Woodhouse handsome clever rich comfortable home happy disposition seemed unite best blessings existence lived nearly twentyone years world little distress vex\n",
-      "2. youngest two daughters affectionate indulgent father consequence sisters marriage mistress house early period\n",
-      "3. mother died long ago indistinct remembrance caresses place supplied excellent woman governess fallen little short mother affection\n",
-      "4. Sixteen years Miss Taylor Mr Woodhouses family less governess friend fond daughters particularly Emma\n",
-      "5. intimacy sisters\n",
-      "6. Even Miss Taylor ceased hold nominal office governess mildness\n",
-      "\n",
-      "--------------------------------------------------\n",
-      "SENTENCES AFTER STEMMING\n",
-      "--------------------------------------------------\n",
-      "1. emma by jane austen 1816 volum i chapter i emma woodhous handsom clever and rich with a comfort home and happi disposit seem to unit some of the best bless of exist and had live nearli twentyon year in the world with veri littl to distress or vex her\n",
-      "2. she wa the youngest of the two daughter of a most affection indulg father and had in consequ of her sister marriag been mistress of hi hous from a veri earli period\n",
-      "3. her mother had die too long ago for her to have more than an indistinct remembr of her caress and her place had been suppli by an excel woman as gover who had fallen littl short of a mother in affect\n",
-      "4. sixteen year had miss taylor been in mr woodhous famili less as a gover than a friend veri fond of both daughter but particularli of emma\n",
-      "5. between them it wa more the intimaci of sister\n",
-      "6. even befor miss taylor had ceas to hold the nomin offic of gover the mild\n",
-      "\n",
-      "--------------------------------------------------\n",
-      "SENTENCES AFTER LEMMATIZATION\n",
-      "--------------------------------------------------\n",
-      "1. Emma by Jane Austen 1816 VOLUME I CHAPTER I Emma Woodhouse handsome clever and rich with a comfortable home and happy disposition seemed to unite some of the best blessing of existence and had lived nearly twentyone year in the world with very little to distress or vex her\n",
-      "2. She wa the youngest of the two daughter of a most affectionate indulgent father and had in consequence of her sister marriage been mistress of his house from a very early period\n",
-      "3. Her mother had died too long ago for her to have more than an indistinct remembrance of her caress and her place had been supplied by an excellent woman a governess who had fallen little short of a mother in affection\n",
-      "4. Sixteen year had Miss Taylor been in Mr Woodhouses family le a a governess than a friend very fond of both daughter but particularly of Emma\n",
-      "5. Between them it wa more the intimacy of sister\n",
-      "6. Even before Miss Taylor had ceased to hold the nominal office of governess the mildness\n"
-     ]
-    }
-   ],
-   "source": [
-    "import nltk\n",
-    "from nltk import sent_tokenize, word_tokenize, pos_tag, ne_chunk\n",
-    "from nltk.corpus import gutenberg, stopwords\n",
-    "from nltk.stem import PorterStemmer, WordNetLemmatizer\n",
-    "from nltk.tree import Tree\n",
-    "from nltk.parse import RecursiveDescentParser\n",
-    "import string\n",
-    "\n",
-    "sentences = gutenberg.raw('austen-emma.txt')[:999]\n",
-    "\n",
-    "#Tokenization : using nltk.sent_tokenize, nltk.word_tokenize\n",
-    "sent_tokens = sent_tokenize(sentences)\n",
-    "print('-'*50 + \"\\nTOKENIZED SENTENCES\\n\" + '-'*50)\n",
-    "for i, sentence in enumerate(sent_tokens) :\n",
-    "    sentence = sentence.replace('\\n', ' ')\n",
-    "    print(rf\"{i+1}. {sentence}\")\n",
-    "\n",
-    "word_tokens = []; print()\n",
-    "print('-'*50 + \"\\nTOKENIZED WORDS\\n\" + '-'*50)\n",
-    "for sentence in sent_tokens :\n",
-    "    translator = str.maketrans('', '', string.punctuation)\n",
-    "    sentence = sentence.translate(translator)\n",
-    "    word_tokens.append(word_tokenize(sentence))\n",
-    "    print(word_tokens[-1], end = '\\n')\n",
-    "\n",
-    "#Removal of stopwords : using nltk.corpus.stopwords\n",
-    "stops = set(stopwords.words('english')); print()\n",
-    "print('-'*50 + \"\\nSENTENCES WITHOUT STOPWORDS\\n\" + '-'*50)\n",
-    "for i, tokens in enumerate(word_tokens) :\n",
-    "    sentence = ' '.join(word for word in tokens if word.lower() not in stops)\n",
-    "    print(rf\"{i+1}. {sentence}\")\n",
-    "\n",
-    "#Stemming : using nltk.stem.PorterStemmer\n",
-    "stemmer = PorterStemmer(); print()\n",
-    "print('-'*50 + \"\\nSENTENCES AFTER STEMMING\\n\" + '-'*50)\n",
-    "for i, tokens in enumerate(word_tokens) :\n",
-    "    sentence = ' '.join(stemmer.stem(word) for word in tokens)\n",
-    "    print(rf\"{i+1}. {sentence}\")\n",
-    "\n",
-    "#Lemmatization : using nltk.stem.WordNetLemmatizer\n",
-    "lemmatizer = WordNetLemmatizer(); print()\n",
-    "print('-'*50 + \"\\nSENTENCES AFTER LEMMATIZATION\\n\" + '-'*50)\n",
-    "for i, tokens in enumerate(word_tokens) :\n",
-    "    sentence = ' '.join(lemmatizer.lemmatize(word) for word in tokens)\n",
-    "    print(rf\"{i+1}. {sentence}\")\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 2,
-   "id": "a8f0edc0-167d-4c7a-bbce-530c5a146861",
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "--------------------------------------------------\n",
-      "TAGGED TOKENS\n",
-      "--------------------------------------------------\n",
-      "1. [('Emma', 'NN'), ('by', 'IN'), ('Jane', 'NNP'), ('Austen', 'NNP'), ('1816', 'CD'), ('VOLUME', 'NNP'), ('I', 'PRP'), ('CHAPTER', 'VBP'), ('I', 'PRP'), ('Emma', 'NNP'), ('Woodhouse', 'NNP'), ('handsome', 'VBD'), ('clever', 'NN'), ('and', 'CC'), ('rich', 'JJ'), ('with', 'IN'), ('a', 'DT'), ('comfortable', 'JJ'), ('home', 'NN'), ('and', 'CC'), ('happy', 'JJ'), ('disposition', 'NN'), ('seemed', 'VBD'), ('to', 'TO'), ('unite', 'VB'), ('some', 'DT'), ('of', 'IN'), ('the', 'DT'), ('best', 'JJS'), ('blessings', 'NNS'), ('of', 'IN'), ('existence', 'NN'), ('and', 'CC'), ('had', 'VBD'), ('lived', 'VBN'), ('nearly', 'RB'), ('twentyone', 'CD'), ('years', 'NNS'), ('in', 'IN'), ('the', 'DT'), ('world', 'NN'), ('with', 'IN'), ('very', 'RB'), ('little', 'JJ'), ('to', 'TO'), ('distress', 'VB'), ('or', 'CC'), ('vex', 'VB'), ('her', 'PRP$')]\n",
-      "\n",
-      "2. [('She', 'PRP'), ('was', 'VBD'), ('the', 'DT'), ('youngest', 'JJS'), ('of', 'IN'), ('the', 'DT'), ('two', 'CD'), ('daughters', 'NNS'), ('of', 'IN'), ('a', 'DT'), ('most', 'RBS'), ('affectionate', 'JJ'), ('indulgent', 'NN'), ('father', 'NN'), ('and', 'CC'), ('had', 'VBD'), ('in', 'IN'), ('consequence', 'NN'), ('of', 'IN'), ('her', 'PRP$'), ('sisters', 'NNS'), ('marriage', 'VBP'), ('been', 'VBN'), ('mistress', 'NN'), ('of', 'IN'), ('his', 'PRP$'), ('house', 'NN'), ('from', 'IN'), ('a', 'DT'), ('very', 'RB'), ('early', 'JJ'), ('period', 'NN')]\n",
-      "\n",
-      "3. [('Her', 'PRP$'), ('mother', 'NN'), ('had', 'VBD'), ('died', 'VBN'), ('too', 'RB'), ('long', 'RB'), ('ago', 'RB'), ('for', 'IN'), ('her', 'PRP$'), ('to', 'TO'), ('have', 'VB'), ('more', 'JJR'), ('than', 'IN'), ('an', 'DT'), ('indistinct', 'JJ'), ('remembrance', 'NN'), ('of', 'IN'), ('her', 'PRP$'), ('caresses', 'NNS'), ('and', 'CC'), ('her', 'PRP$'), ('place', 'NN'), ('had', 'VBD'), ('been', 'VBN'), ('supplied', 'VBN'), ('by', 'IN'), ('an', 'DT'), ('excellent', 'JJ'), ('woman', 'NN'), ('as', 'IN'), ('governess', 'NN'), ('who', 'WP'), ('had', 'VBD'), ('fallen', 'VBN'), ('little', 'JJ'), ('short', 'JJ'), ('of', 'IN'), ('a', 'DT'), ('mother', 'NN'), ('in', 'IN'), ('affection', 'NN')]\n",
-      "\n",
-      "4. [('Sixteen', 'JJ'), ('years', 'NNS'), ('had', 'VBD'), ('Miss', 'NNP'), ('Taylor', 'NNP'), ('been', 'VBN'), ('in', 'IN'), ('Mr', 'NNP'), ('Woodhouses', 'NNP'), ('family', 'NN'), ('less', 'CC'), ('as', 'IN'), ('a', 'DT'), ('governess', 'NN'), ('than', 'IN'), ('a', 'DT'), ('friend', 'JJ'), ('very', 'RB'), ('fond', 'NN'), ('of', 'IN'), ('both', 'DT'), ('daughters', 'NNS'), ('but', 'CC'), ('particularly', 'RB'), ('of', 'IN'), ('Emma', 'NNP')]\n",
-      "\n",
-      "5. [('Between', 'IN'), ('them', 'PRP'), ('it', 'PRP'), ('was', 'VBD'), ('more', 'RBR'), ('the', 'DT'), ('intimacy', 'NN'), ('of', 'IN'), ('sisters', 'NNS')]\n",
-      "\n",
-      "6. [('Even', 'RB'), ('before', 'IN'), ('Miss', 'NNP'), ('Taylor', 'NNP'), ('had', 'VBD'), ('ceased', 'VBN'), ('to', 'TO'), ('hold', 'VB'), ('the', 'DT'), ('nominal', 'JJ'), ('office', 'NN'), ('of', 'IN'), ('governess', 'NN'), ('the', 'DT'), ('mildness', 'NN')]\n",
-      "\n"
-     ]
-    }
-   ],
-   "source": [
-    "#Tagging tokens with their parts of speech: using nltk.pos_tag\n",
-    "print('-'*50 + \"\\nTAGGED TOKENS\\n\" + '-'*50)\n",
-    "tags = []\n",
-    "for i, tokens in enumerate(word_tokens) :\n",
-    "    tags += [pos_tag(tokens)] \n",
-    "    print(f\"{i+1}. {tags[-1]}\\n\")"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 3,
-   "id": "d97e2c48-7b41-4b09-a429-6f81dd2644f8",
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "--------------------------------------------------\n",
-      "NAMED ENTITIES\n",
-      "--------------------------------------------------\n",
-      "ENTITY : Emma, LABEL : GPE\n",
-      "ENTITY : Jane Austen, LABEL : PERSON\n",
-      "ENTITY : Emma Woodhouse, LABEL : PERSON\n",
-      "ENTITY : Miss Taylor, LABEL : PERSON\n",
-      "ENTITY : Emma, LABEL : GPE\n",
-      "ENTITY : Miss Taylor, LABEL : PERSON\n"
-     ]
-    }
-   ],
-   "source": [
-    "#Named entity recognition : using nltk.ne_chunk\n",
-    "print('-'*50 + \"\\nNAMED ENTITIES\\n\" + '-'*50)\n",
-    "ne = []\n",
-    "for i, tag in enumerate(tags) :\n",
-    "    ne += [ne_chunk(tag)] \n",
-    "    for subtree in ne[-1] :\n",
-    "        if isinstance(subtree, Tree) :\n",
-    "            words = [word for word,tag in subtree.leaves()]\n",
-    "            label = subtree.label()\n",
-    "            print(f\"ENTITY : {' '.join(words)}, LABEL : {label}\")\n",
-    "    "
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 4,
-   "id": "21223618-2544-42ca-b6b5-61ad138b7d3c",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "#To extract the list of all tags in nltk\n",
-    "# tagdict = nltk.data.load('help/tagsets/upenn_tagset.pickle')"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 3,
-   "id": "b5849635-4186-40d4-95d6-539226f04fb5",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "import nltk\n",
-    "from nltk import word_tokenize\n",
-    "from nltk.parse import RecursiveDescentParser"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 4,
-   "id": "5cc00f51-fb75-49f2-ab56-b01503e2eef2",
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "PARSE TREES : \n",
-      "                      S                             \n",
-      "      ________________|_______                       \n",
-      "     |                        VP                    \n",
-      "     |             ___________|_________             \n",
-      "     |            |       |             PP          \n",
-      "     |            |       |         ____|___         \n",
-      "     NP           |       NP       |        NP      \n",
-      "  ___|_____       |    ___|___     |     ___|____    \n",
-      " AT       NNS     V   AT      NN   IN   AT       NN \n",
-      " |         |      |   |       |    |    |        |   \n",
-      "The     children ate the     cake with  a      spoon\n",
-      "\n",
-      "                      S                         \n",
-      "      ________________|___                       \n",
-      "     |                    VP                    \n",
-      "     |             _______|____                  \n",
-      "     |            |            NP               \n",
-      "     |            |    ________|____             \n",
-      "     |            |   |   |         PP          \n",
-      "     |            |   |   |     ____|___         \n",
-      "     NP           |   |   |    |        NP      \n",
-      "  ___|_____       |   |   |    |     ___|____    \n",
-      " AT       NNS     V   AT  NN   IN   AT       NN \n",
-      " |         |      |   |   |    |    |        |   \n",
-      "The     children ate the cake with  a      spoon\n",
-      "\n"
-     ]
-    }
-   ],
-   "source": [
-    "#Displaying a ParseTree and finding the ambiguity of a given sentence\n",
-    "grammar = nltk.CFG.fromstring(\"\"\"\n",
-    "    S -> NP VP\n",
-    "    NP -> AT NNS | AT NN | AT NNS PP | AT NN PP\n",
-    "    VP -> V NP PP | V NP\n",
-    "    PP -> IN NP\n",
-    "    AT -> \"The\" | \"the\" | \"a\"\n",
-    "    NNS -> \"children\"\n",
-    "    V -> \"ate\"\n",
-    "    NN -> \"cake\" | \"spoon\"\n",
-    "    IN -> \"with\"\n",
-    "\"\"\")\n",
-    "\n",
-    "#parser = RecursiveDescentParser(grammar) #Can use nltk.ChartParser too\n",
-    "parser = RecursiveDescentParser(grammar)\n",
-    "print(\"PARSE TREES : \")\n",
-    "\n",
-    "tokens = word_tokenize(\"The children ate the cake with a spoon\")\n",
-    "\n",
-    "for tree in parser.parse(tokens) :\n",
-    "    tree.pretty_print()"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "092acc71-a390-4f6b-a6f7-c2e3fe2a82d5",
-   "metadata": {},
-   "outputs": [],
-   "source": []
-  }
- ],
- "metadata": {
-  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.12.4"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 5
-}
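
Note on the deleted notebook above: its last cell prints two parse trees for "The children ate the cake with a spoon", which is the ambiguity the experiment asks for (the prepositional phrase attaches either to the verb phrase or to "the cake"). As a minimal sketch that does not depend on NLTK, the same grammar can be written as plain dictionaries and the number of parses counted directly. `GRAMMAR`, `LEXICON`, and `count_parses` are hypothetical names introduced here for illustration, not part of the package, and the counter assumes a grammar without empty productions or unary cycles.

```python
from functools import lru_cache

# Same rules as the CFG string in the notebook, as plain data (illustrative names).
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("AT", "NNS"), ("AT", "NN"), ("AT", "NNS", "PP"), ("AT", "NN", "PP")],
    "VP": [("V", "NP", "PP"), ("V", "NP")],
    "PP": [("IN", "NP")],
}
LEXICON = {
    "AT": {"The", "the", "a"},
    "NNS": {"children"},
    "V": {"ate"},
    "NN": {"cake", "spoon"},
    "IN": {"with"},
}


def count_parses(tokens, start="S"):
    """Count the distinct parse trees that `start` yields for `tokens`."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def derive(symbol, i, j):
        # Number of trees in which `symbol` spans tokens[i:j].
        if symbol in LEXICON:
            return int(j - i == 1 and tokens[i] in LEXICON[symbol])
        return sum(sequence(rhs, i, j) for rhs in GRAMMAR[symbol])

    @lru_cache(maxsize=None)
    def sequence(rhs, i, j):
        # Number of ways the symbol sequence `rhs` spans tokens[i:j].
        if len(rhs) == 1:
            return derive(rhs[0], i, j)
        total = 0
        # Leave at least one token for each remaining symbol.
        for k in range(i + 1, j - len(rhs) + 2):
            left = derive(rhs[0], i, k)
            if left:
                total += left * sequence(rhs[1:], k, j)
        return total

    return derive(start, 0, len(tokens))


if __name__ == "__main__":
    sentence = "The children ate the cake with a spoon".split()
    print("number of parses:", count_parses(sentence))
```

A count greater than one means the sentence is structurally ambiguous under this grammar; a sentence without the prepositional phrase, such as "The children ate the cake", has a single parse.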

noshot/data/ML TS XAI/NLP/NLP 6/2. T Test and Chi2 Test.ipynb DELETED

@@ -1,185 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "id": "9b7114df-da26-4d09-8d80-d7f2f6e929e2",
-   "metadata": {},
-   "source": [
-    "# Experiment 2 :\n",
-    "<b>Perform t-Test and Chi-Square test to check whether a given sequence of words is a\r\n",
-    "collocation or not.</b>"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "0ce25ae9-d862-4999-a198-0681b3aad9e7",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "import nltk\n",
-    "from nltk import word_tokenize, sent_tokenize\n",
-    "from nltk.corpus import gutenberg, stopwords\n",
-    "import string\n",
-    "\n",
-    "data = gutenberg.raw('austen-emma.txt')\n",
-    "\n",
-    "#PREPROCESSING THE GIVEN DATA\n",
-    "\n",
-    "#Tokenization, stopwords removal\n",
-    "sent_tokens = sent_tokenize(data)\n",
-    "word_tokens = []\n",
-    "for sentence in sent_tokens :\n",
-    "    sentence = sentence.translate(str.maketrans('', '', string.punctuation))\n",
-    "    word_tokens += word_tokenize(sentence)\n",
-    "stops = set(stopwords.words('english'))\n",
-    "word_tokens = [word for word in word_tokens if word.lower() not in stops]\n",
-    "\n",
-    "#Frequency, Probability\n",
-    "unique_words = set(word_tokens)\n",
-    "print(f\"TOTAL WORDS IN THE CORPUS : {len(word_tokens)}\")\n",
-    "print(f\"UNIQUE WORDS : {len(unique_words)}\")\n",
-    "\n",
-    "frequency = {word : word_tokens.count(word) for word in unique_words}\n",
-    "propability = {word : frequency[word]/len(word_tokens) for word in unique_words}"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "0d5fc074-8256-43e6-b57c-e62cde3b05d2",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "#Generating Bigrams, frequency and probability of bigrams\n",
-    "bigrams = zip(word_tokens[:-1], word_tokens[1:])\n",
-    "bigram_freq = {}\n",
-    "bigram_count = 0\n",
-    "for bigram in bigrams :\n",
-    "    bigram_count += 1\n",
-    "    if bigram in bigram_freq :\n",
-    "        bigram_freq[bigram] += 1\n",
-    "    else :\n",
-    "        bigram_freq[bigram] = 1\n",
-    "bigram_prop = {}\n",
-    "for bigram, freq in bigram_freq.items() : \n",
-    "    bigram_prop[bigram] = freq/bigram_count\n",
-    "print(\"TOTAL UNIQUE BIGRAMS :\", len(bigram_freq))"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "d064f6da-7399-4e4c-aa08-4d65c37443d5",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "import math\n",
-    "from scipy.stats import t,chi2 #For Critical value(feed value if given)\n",
-    "#T-test demonstration\n",
-    "t_colloc = []\n",
-    "n = len(word_tokens)\n",
-    "t_critical = t.ppf(1-0.05, n-1)\n",
-    "for bigram, prop in bigram_prop.items() :\n",
-    "    w1, w2 = bigram\n",
-    "    mu = propability[w1] * propability[w2]\n",
-    "    X_ = prop\n",
-    "    t_stat = (X_ - mu)/math.sqrt((X_*(1-X_))/n)\n",
-    "    if t_stat > t_critical :\n",
-    "        t_colloc.append(bigram)\n",
-    "print(f\"{len(t_colloc)} COLLOCATIONS IN THE CORPUS DETERMINED FROM T-TEST : \\n\")\n",
-    "print(t_colloc)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "329688e7-5256-44e3-bfe2-d78a7476b44b",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "#Chi^2 TEST demonstration\n",
-    "chi_colloc = []\n",
-    "n = len(word_tokens)\n",
-    "chi_critical = chi2.ppf(1-0.05, 1)\n",
-    "for bigram, prop in bigram_prop.items() :\n",
-    "    w1, w2 = bigram\n",
-    "    f1 = frequency[w1]\n",
-    "    f2 = frequency[w2]\n",
-    "    #Observed Frequencies\n",
-    "    o_1_2 = bigram_freq[bigram]\n",
-    "    o_n1_2 = f2 - o_1_2\n",
-    "    o_1_n2 = f1 - o_1_2\n",
-    "    o_n1_n2 = n - (o_1_2 + o_n1_2 + o_1_n2)\n",
-    "    obs = [o_1_2, o_n1_2, o_1_n2, o_n1_n2]\n",
-    "    #Expected frequencies\n",
-    "    e_1_2 = (f1 * f2)/n\n",
-    "    e_n1_2 = ((n - f1) * f2)/n\n",
-    "    e_1_n2 = (f1 * (n - f2))/n\n",
-    "    e_n1_n2 = ((n - f1)*(n - f2))/n\n",
-    "    exp = [e_1_2, e_n1_2, e_1_n2, e_n1_n2]\n",
-    "    chi_stat = sum( ((obs[i] - exp[i])**2)/exp[i] for i in range(4))\n",
-    "    if chi_stat > chi_critical :\n",
-    "        chi_colloc.append(bigram)\n",
-    "print(f\"{len(chi_colloc)} COLLOCATIONS IN THE CORPUS DETERMINED FROM CHI^2-TEST : \\n\")\n",
-    "print(chi_colloc)\n",
-    "    "
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "0e4398af-30f7-4eaa-9b2c-fe55d1974d70",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "f1 = 15828\n",
-    "f2 = 4675\n",
-    "n = 14307676\n",
-    "#Observed Frequencies\n",
-    "o_1_2 = 8\n",
-    "o_n1_2 = f2 - o_1_2\n",
-    "o_1_n2 = f1 - o_1_2\n",
-    "o_n1_n2 = n - (o_1_2 + o_n1_2 + o_1_n2)\n",
-    "obs = [o_1_2, o_n1_2, o_1_n2, o_n1_n2]\n",
-    "#Expected frequencies\n",
-    "e_1_2 = (f1 * f2)/n\n",
-    "e_n1_2 = ((n - f1) * f2)/n\n",
-    "e_1_n2 = (f1 * (n - f2))/n\n",
-    "e_n1_n2 = ((n - f1)*(n - f2))/n\n",
-    "exp = [e_1_2, e_n1_2, e_1_n2, e_n1_n2]\n",
-    "chi_stat = sum( ((obs[i] - exp[i])**2)/exp[i] for i in range(4))\n",
-    "print(chi_stat)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "2f065df9-3caa-4f59-94b2-5c65b0fc47d0",
-   "metadata": {},
-   "outputs": [],
-   "source": []
-  }
- ],
- "metadata": {
-  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.12.4"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 5
-}
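
Note on the deleted notebook above: its last code cell hard-codes f1 = 15828, f2 = 4675, n = 14307676 and an observed bigram count of 8. The sketch below applies both tests to those values in plain Python, using the same formulas as the notebook cells. The function names are hypothetical, and the critical values 1.645 and 3.841 are the usual table values at the 0.05 level (one-sided t with very large degrees of freedom, and chi-square with 1 degree of freedom), hard-coded here as an assumption instead of calling scipy.

```python
import math

# Table values at alpha = 0.05 (assumed constants, not taken from the package).
T_CRITICAL = 1.645      # one-sided t, very large degrees of freedom
CHI2_CRITICAL = 3.841   # chi-square, 1 degree of freedom


def t_score(f1, f2, f12, n):
    """t statistic for a bigram, as in the notebook's t-test cell."""
    x_bar = f12 / n                 # observed bigram probability
    mu = (f1 / n) * (f2 / n)        # expected probability under independence
    return (x_bar - mu) / math.sqrt(x_bar * (1 - x_bar) / n)


def chi_square(f1, f2, f12, n):
    """Pearson chi-square over the 2x2 table, as in the notebook's chi-square cell."""
    observed = [f12, f2 - f12, f1 - f12, n - f1 - f2 + f12]
    expected = [
        f1 * f2 / n,
        (n - f1) * f2 / n,
        f1 * (n - f2) / n,
        (n - f1) * (n - f2) / n,
    ]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


if __name__ == "__main__":
    f1, f2, f12, n = 15828, 4675, 8, 14307676  # values from the notebook cell
    t_stat = t_score(f1, f2, f12, n)
    chi_stat = chi_square(f1, f2, f12, n)
    print(f"t = {t_stat:.4f}, collocation by t-test: {t_stat > T_CRITICAL}")
    print(f"chi^2 = {chi_stat:.4f}, collocation by chi^2-test: {chi_stat > CHI2_CRITICAL}")
```

For these counts the t statistic is close to 1.0 and the chi-square statistic is about 1.55, so both fall below their critical values and neither test would flag the bigram as a collocation.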
