noshot 0.1.7__py3-none-any.whl → 0.1.8__py3-none-any.whl

This diff shows the contents of publicly available package versions released to one of the supported registries. It is provided for informational purposes only and reflects the changes between package versions as they appear in their respective public registries.
Files changed (228)
  1. noshot/data/ML TS XAI/ML/1. PCA - EDA/PCA-EDA.ipynb +207 -0
  2. noshot/data/ML TS XAI/ML/1. PCA - EDA/balance-scale.csv +626 -0
  3. noshot/data/ML TS XAI/ML/1. PCA - EDA/input.txt +625 -0
  4. noshot/data/ML TS XAI/ML/2. KNN Classifier/KNN.ipynb +287 -0
  5. noshot/data/ML TS XAI/ML/2. KNN Classifier/balance-scale.csv +626 -0
  6. noshot/data/ML TS XAI/ML/2. KNN Classifier/input.txt +625 -0
  7. noshot/data/ML TS XAI/ML/3. Linear Discriminant Analysis/LDA.ipynb +83 -0
  8. noshot/data/ML TS XAI/ML/3. Linear Discriminant Analysis/balance-scale.csv +626 -0
  9. noshot/data/ML TS XAI/ML/3. Linear Discriminant Analysis/input.txt +625 -0
  10. noshot/data/ML TS XAI/ML/4. Linear Regression/Linear-Regression.ipynb +117 -0
  11. noshot/data/ML TS XAI/ML/4. Linear Regression/machine-data.csv +210 -0
  12. noshot/data/ML TS XAI/ML/5. Logistic Regression/Logistic-Regression.ipynb +137 -0
  13. noshot/data/ML TS XAI/ML/5. Logistic Regression/wine-dataset.csv +179 -0
  14. noshot/data/ML TS XAI/ML/6. Bayesian Classifier/Bayesian.ipynb +129 -0
  15. noshot/data/ML TS XAI/ML/6. Bayesian Classifier/wine-dataset.csv +179 -0
  16. {noshot-0.1.7.dist-info → noshot-0.1.8.dist-info}/METADATA +2 -2
  17. noshot-0.1.8.dist-info/RECORD +24 -0
  18. noshot/data/ML TS XAI/AIDS/1. Implement Basic Search Strategies/(A) Breadth First Search.ipynb +0 -112
  19. noshot/data/ML TS XAI/AIDS/1. Implement Basic Search Strategies/(B) Depth First Search.ipynb +0 -111
  20. noshot/data/ML TS XAI/AIDS/1. Implement Basic Search Strategies/(C) Uniform Cost Search.ipynb +0 -134
  21. noshot/data/ML TS XAI/AIDS/1. Implement Basic Search Strategies/(D) Depth Limites Search.ipynb +0 -115
  22. noshot/data/ML TS XAI/AIDS/1. Implement Basic Search Strategies/(E) Iterative Deepening DFS.ipynb +0 -123
  23. noshot/data/ML TS XAI/AIDS/10. ANOVA/2_ANOVA.csv +0 -769
  24. noshot/data/ML TS XAI/AIDS/10. ANOVA/One Way ANOVA (Repeated Measure).ipynb +0 -126
  25. noshot/data/ML TS XAI/AIDS/10. ANOVA/One Way ANOVA.ipynb +0 -134
  26. noshot/data/ML TS XAI/AIDS/10. ANOVA/Sample 1 Way ANOVA Test.ipynb +0 -119
  27. noshot/data/ML TS XAI/AIDS/10. ANOVA/Two Way ANOVA.ipynb +0 -138
  28. noshot/data/ML TS XAI/AIDS/10. ANOVA/reaction_time.csv +0 -5
  29. noshot/data/ML TS XAI/AIDS/10. ANOVA/sample_data.csv +0 -16
  30. noshot/data/ML TS XAI/AIDS/10. ANOVA/sleep_deprivation.csv +0 -4
  31. noshot/data/ML TS XAI/AIDS/11. Linear Regression/3_Linear.csv +0 -4802
  32. noshot/data/ML TS XAI/AIDS/11. Linear Regression/Linear Regression LAB.ipynb +0 -113
  33. noshot/data/ML TS XAI/AIDS/11. Linear Regression/Linear Regression New- sklearn.ipynb +0 -118
  34. noshot/data/ML TS XAI/AIDS/11. Linear Regression/Linear Regression.ipynb +0 -148
  35. noshot/data/ML TS XAI/AIDS/11. Linear Regression/house_rate.csv +0 -22
  36. noshot/data/ML TS XAI/AIDS/12. Logistic Regression/Logistic Regression New- sklearn.ipynb +0 -128
  37. noshot/data/ML TS XAI/AIDS/12. Logistic Regression/Logistic Regression.ipynb +0 -145
  38. noshot/data/ML TS XAI/AIDS/12. Logistic Regression/default.csv +0 -1001
  39. noshot/data/ML TS XAI/AIDS/12. Logistic Regression/hours_scores_records.csv +0 -101
  40. noshot/data/ML TS XAI/AIDS/2. Implement A Star And MA Star/(A) Astar.ipynb +0 -256
  41. noshot/data/ML TS XAI/AIDS/2. Implement A Star And MA Star/(B) IDAstar.ipynb +0 -157
  42. noshot/data/ML TS XAI/AIDS/2. Implement A Star And MA Star/(C) SMAstar.ipynb +0 -178
  43. noshot/data/ML TS XAI/AIDS/3. Genetic Algorithm/Genetic.ipynb +0 -95
  44. noshot/data/ML TS XAI/AIDS/4. Simulated Annealing/Simulated Annealing.ipynb +0 -74
  45. noshot/data/ML TS XAI/AIDS/4. Simulated Annealing/Sudoku Simulated Annealing.ipynb +0 -103
  46. noshot/data/ML TS XAI/AIDS/5. Alpha Beta Pruning/AlphaBetaPruning.ipynb +0 -182
  47. noshot/data/ML TS XAI/AIDS/6. Consraint Satisfaction Problems (CSP)/(A) CSP House Allocation.ipynb +0 -120
  48. noshot/data/ML TS XAI/AIDS/6. Consraint Satisfaction Problems (CSP)/(B) CSP Map Coloring.ipynb +0 -125
  49. noshot/data/ML TS XAI/AIDS/7. Random Sampling/Random Sampling.ipynb +0 -73
  50. noshot/data/ML TS XAI/AIDS/7. Random Sampling/height_weight_bmi.csv +0 -8389
  51. noshot/data/ML TS XAI/AIDS/8. Z Test/Z Test Hash Function.ipynb +0 -141
  52. noshot/data/ML TS XAI/AIDS/8. Z Test/Z Test.ipynb +0 -151
  53. noshot/data/ML TS XAI/AIDS/8. Z Test/height_weight_bmi.csv +0 -8389
  54. noshot/data/ML TS XAI/AIDS/9. T Test/1_heart.csv +0 -304
  55. noshot/data/ML TS XAI/AIDS/9. T Test/Independent T Test.ipynb +0 -119
  56. noshot/data/ML TS XAI/AIDS/9. T Test/Paired T Test.ipynb +0 -118
  57. noshot/data/ML TS XAI/AIDS/9. T Test/T Test Hash Function.ipynb +0 -142
  58. noshot/data/ML TS XAI/AIDS/9. T Test/T Test.ipynb +0 -158
  59. noshot/data/ML TS XAI/AIDS/9. T Test/height_weight_bmi.csv +0 -8389
  60. noshot/data/ML TS XAI/AIDS/9. T Test/iq_test.csv +0 -0
  61. noshot/data/ML TS XAI/AIDS/Others (AllinOne)/All In One.ipynb +0 -4581
  62. noshot/data/ML TS XAI/CN/1. Chat Application/chat.java +0 -81
  63. noshot/data/ML TS XAI/CN/1. Chat Application/output.png +0 -0
  64. noshot/data/ML TS XAI/CN/1. Chat Application/procedure.png +0 -0
  65. noshot/data/ML TS XAI/CN/10. Ethernet LAN IEEE 802.3/LAN.tcl +0 -65
  66. noshot/data/ML TS XAI/CN/10. Ethernet LAN IEEE 802.3/analysis.awk +0 -44
  67. noshot/data/ML TS XAI/CN/10. Ethernet LAN IEEE 802.3/output.png +0 -0
  68. noshot/data/ML TS XAI/CN/10. Ethernet LAN IEEE 802.3/procedure.png +0 -0
  69. noshot/data/ML TS XAI/CN/11. Wireless LAN IEEE 802.11/complexdcf.tcl +0 -229
  70. noshot/data/ML TS XAI/CN/11. Wireless LAN IEEE 802.11/output.png +0 -0
  71. noshot/data/ML TS XAI/CN/11. Wireless LAN IEEE 802.11/procedure.png +0 -0
  72. noshot/data/ML TS XAI/CN/2. File Transfer/file_to_send.txt +0 -2
  73. noshot/data/ML TS XAI/CN/2. File Transfer/filetransfer.java +0 -119
  74. noshot/data/ML TS XAI/CN/2. File Transfer/output.png +0 -0
  75. noshot/data/ML TS XAI/CN/2. File Transfer/procedure.png +0 -0
  76. noshot/data/ML TS XAI/CN/3. RMI (Remote Method Invocation)/Client.class +0 -0
  77. noshot/data/ML TS XAI/CN/3. RMI (Remote Method Invocation)/MyServerImpl.class +0 -0
  78. noshot/data/ML TS XAI/CN/3. RMI (Remote Method Invocation)/MyServerIntf.class +0 -0
  79. noshot/data/ML TS XAI/CN/3. RMI (Remote Method Invocation)/Server.class +0 -0
  80. noshot/data/ML TS XAI/CN/3. RMI (Remote Method Invocation)/output.png +0 -0
  81. noshot/data/ML TS XAI/CN/3. RMI (Remote Method Invocation)/procedure.png +0 -0
  82. noshot/data/ML TS XAI/CN/3. RMI (Remote Method Invocation)/rmi.java +0 -56
  83. noshot/data/ML TS XAI/CN/4. Wired Network/output.png +0 -0
  84. noshot/data/ML TS XAI/CN/4. Wired Network/procedure.png +0 -0
  85. noshot/data/ML TS XAI/CN/4. Wired Network/wired.awk +0 -25
  86. noshot/data/ML TS XAI/CN/4. Wired Network/wired.tcl +0 -81
  87. noshot/data/ML TS XAI/CN/5. Wireless Network/output.png +0 -0
  88. noshot/data/ML TS XAI/CN/5. Wireless Network/procedure.png +0 -0
  89. noshot/data/ML TS XAI/CN/5. Wireless Network/wireless.awk +0 -27
  90. noshot/data/ML TS XAI/CN/5. Wireless Network/wireless.tcl +0 -153
  91. noshot/data/ML TS XAI/CN/6. TCP Flow And Congestion Control/TCP Congestion Control/Sack And Vegas/analysis.awk +0 -27
  92. noshot/data/ML TS XAI/CN/6. TCP Flow And Congestion Control/TCP Congestion Control/Sack And Vegas/output.png +0 -0
  93. noshot/data/ML TS XAI/CN/6. TCP Flow And Congestion Control/TCP Congestion Control/Sack And Vegas/sack.tcl +0 -86
  94. noshot/data/ML TS XAI/CN/6. TCP Flow And Congestion Control/TCP Congestion Control/Sack And Vegas/vegas.tcl +0 -86
  95. noshot/data/ML TS XAI/CN/6. TCP Flow And Congestion Control/TCP Congestion Control/Tahoe And Reno/analysis.awk +0 -28
  96. noshot/data/ML TS XAI/CN/6. TCP Flow And Congestion Control/TCP Congestion Control/Tahoe And Reno/output.png +0 -0
  97. noshot/data/ML TS XAI/CN/6. TCP Flow And Congestion Control/TCP Congestion Control/Tahoe And Reno/reno.tcl +0 -78
  98. noshot/data/ML TS XAI/CN/6. TCP Flow And Congestion Control/TCP Congestion Control/Tahoe And Reno/tahoe.tcl +0 -79
  99. noshot/data/ML TS XAI/CN/6. TCP Flow And Congestion Control/TCP Flow Control/analysis.awk +0 -27
  100. noshot/data/ML TS XAI/CN/6. TCP Flow And Congestion Control/TCP Flow Control/flow.tcl +0 -163
  101. noshot/data/ML TS XAI/CN/6. TCP Flow And Congestion Control/TCP Flow Control/output.png +0 -0
  102. noshot/data/ML TS XAI/CN/6. TCP Flow And Congestion Control/procedure.png +0 -0
  103. noshot/data/ML TS XAI/CN/7. Link State And Distance Vector Routing/DV.tcl +0 -111
  104. noshot/data/ML TS XAI/CN/7. Link State And Distance Vector Routing/LS.tcl +0 -106
  105. noshot/data/ML TS XAI/CN/7. Link State And Distance Vector Routing/analysis.awk +0 -36
  106. noshot/data/ML TS XAI/CN/7. Link State And Distance Vector Routing/output.png +0 -0
  107. noshot/data/ML TS XAI/CN/7. Link State And Distance Vector Routing/procedure.png +0 -0
  108. noshot/data/ML TS XAI/CN/8. Multicast And Broadcast Routing/analysis.awk +0 -20
  109. noshot/data/ML TS XAI/CN/8. Multicast And Broadcast Routing/broadcast.tcl +0 -76
  110. noshot/data/ML TS XAI/CN/8. Multicast And Broadcast Routing/multicast.tcl +0 -103
  111. noshot/data/ML TS XAI/CN/8. Multicast And Broadcast Routing/output.png +0 -0
  112. noshot/data/ML TS XAI/CN/8. Multicast And Broadcast Routing/procedure.png +0 -0
  113. noshot/data/ML TS XAI/CN/9. DHCP/DHCP.java +0 -125
  114. noshot/data/ML TS XAI/CN/9. DHCP/output.png +0 -0
  115. noshot/data/ML TS XAI/CN/9. DHCP/procedure.png +0 -0
  116. noshot/data/ML TS XAI/NLP/NLP 1/1-Prereqs.py +0 -18
  117. noshot/data/ML TS XAI/NLP/NLP 1/2-Chi2test.py +0 -83
  118. noshot/data/ML TS XAI/NLP/NLP 1/2-T-test.py +0 -79
  119. noshot/data/ML TS XAI/NLP/NLP 1/3-WSD-nb.py +0 -53
  120. noshot/data/ML TS XAI/NLP/NLP 1/4-Hindle-Rooth.py +0 -53
  121. noshot/data/ML TS XAI/NLP/NLP 1/5-HMM-Trellis.py +0 -82
  122. noshot/data/ML TS XAI/NLP/NLP 1/6-HMM-Viterbi.py +0 -16
  123. noshot/data/ML TS XAI/NLP/NLP 1/7-PCFG-parsetree.py +0 -15
  124. noshot/data/ML TS XAI/NLP/NLP 1/Chi2test.ipynb +0 -285
  125. noshot/data/ML TS XAI/NLP/NLP 1/Hindle-Rooth.ipynb +0 -179
  126. noshot/data/ML TS XAI/NLP/NLP 1/Lab 10 - Text generator using LSTM.ipynb +0 -1461
  127. noshot/data/ML TS XAI/NLP/NLP 1/Lab 11 NMT.ipynb +0 -2307
  128. noshot/data/ML TS XAI/NLP/NLP 1/PCFG.ipynb +0 -134
  129. noshot/data/ML TS XAI/NLP/NLP 1/Prereqs.ipynb +0 -131
  130. noshot/data/ML TS XAI/NLP/NLP 1/T test.ipynb +0 -252
  131. noshot/data/ML TS XAI/NLP/NLP 1/TFIDF BOW.ipynb +0 -171
  132. noshot/data/ML TS XAI/NLP/NLP 1/Trellis.ipynb +0 -244
  133. noshot/data/ML TS XAI/NLP/NLP 1/WSD.ipynb +0 -645
  134. noshot/data/ML TS XAI/NLP/NLP 1/Word2Vec.ipynb +0 -93
  135. noshot/data/ML TS XAI/NLP/NLP 2/Lab01(tokenizer)/tokenizer.ipynb +0 -370
  136. noshot/data/ML TS XAI/NLP/NLP 2/Lab01(tokenizer)/training_tokenizer.txt +0 -6
  137. noshot/data/ML TS XAI/NLP/NLP 2/Lab02(stemming)/exp0.ipynb +0 -274
  138. noshot/data/ML TS XAI/NLP/NLP 2/Lab02(stemming)/lab2.ipynb +0 -905
  139. noshot/data/ML TS XAI/NLP/NLP 2/Lab02(stemming)/test.txt +0 -1
  140. noshot/data/ML TS XAI/NLP/NLP 2/Lab02(stemming)/tokenizing.ipynb +0 -272
  141. noshot/data/ML TS XAI/NLP/NLP 2/Lab03(parse-tree)/collocation.ipynb +0 -332
  142. noshot/data/ML TS XAI/NLP/NLP 2/Lab03(parse-tree)/lab3.ipynb +0 -549
  143. noshot/data/ML TS XAI/NLP/NLP 2/Lab03(parse-tree)/nlp.txt +0 -1
  144. noshot/data/ML TS XAI/NLP/NLP 2/Lab04(collocation)/Lab4-NLP-Exp-2.ipynb +0 -817
  145. noshot/data/ML TS XAI/NLP/NLP 2/Lab04(collocation)/collocation.ipynb +0 -332
  146. noshot/data/ML TS XAI/NLP/NLP 2/Lab05(WSD)/NLP-Lab-5-Exp3.ipynb +0 -231
  147. noshot/data/ML TS XAI/NLP/NLP 2/Lab05(WSD)/word-sense-disambiguation.ipynb +0 -507
  148. noshot/data/ML TS XAI/NLP/NLP 2/Lab06(additional-exercise)/lab6.ipynb +0 -134
  149. noshot/data/ML TS XAI/NLP/NLP 2/Lab07(HMM,Viterbi)/NLP Exp 4.ipynb +0 -255
  150. noshot/data/ML TS XAI/NLP/NLP 2/Lab07(HMM,Viterbi)/NLP_Exp_5.ipynb +0 -159
  151. noshot/data/ML TS XAI/NLP/NLP 2/Lab08(PCFG)/PCFG.ipynb +0 -282
  152. noshot/data/ML TS XAI/NLP/NLP 2/Lab09-Hindle-rooth&MLP/Lab 9 - MLP classifier.ipynb +0 -670
  153. noshot/data/ML TS XAI/NLP/NLP 2/Lab09-Hindle-rooth&MLP/MLP-alternative-code.ipynb +0 -613
  154. noshot/data/ML TS XAI/NLP/NLP 2/Lab09-Hindle-rooth&MLP/hindle-rooth-algorithm.ipynb +0 -74
  155. noshot/data/ML TS XAI/NLP/NLP 2/Lab10(LSTM)/Lab_10_Text_generator_using_LSTM.ipynb +0 -480
  156. noshot/data/ML TS XAI/NLP/NLP 2/Lab11(Viterbi-PCFG,Machine-translation)/Machine-translation.ipynb +0 -445
  157. noshot/data/ML TS XAI/NLP/NLP 2/Lab11(Viterbi-PCFG,Machine-translation)/Viterbi-PCFG.ipynb +0 -105
  158. noshot/data/ML TS XAI/NLP/NLP 2/Lab11(Viterbi-PCFG,Machine-translation)/corpora_tools.py +0 -87
  159. noshot/data/ML TS XAI/NLP/NLP 2/Lab11(Viterbi-PCFG,Machine-translation)/data_utils.py +0 -11
  160. noshot/data/ML TS XAI/NLP/NLP 2/Lab11(Viterbi-PCFG,Machine-translation)/train_translator.py +0 -83
  161. noshot/data/ML TS XAI/NLP/NLP 2/Lab12(Information-Extraction)/Information_Extraction.ipynb +0 -201
  162. noshot/data/ML TS XAI/NLP/NLP 3/Backtrack-without-Verbitri.ipynb +0 -185
  163. noshot/data/ML TS XAI/NLP/NLP 3/Backward-Procedure.ipynb +0 -597
  164. noshot/data/ML TS XAI/NLP/NLP 3/Bag_of.ipynb +0 -1422
  165. noshot/data/ML TS XAI/NLP/NLP 3/CYK-algorithm.ipynb +0 -1067
  166. noshot/data/ML TS XAI/NLP/NLP 3/Forward-Procedure.ipynb +0 -477
  167. noshot/data/ML TS XAI/NLP/NLP 3/LSTM.ipynb +0 -1290
  168. noshot/data/ML TS XAI/NLP/NLP 3/Lab 10 - Text generator using LSTM.ipynb +0 -1461
  169. noshot/data/ML TS XAI/NLP/NLP 3/Lab 11 NMT.ipynb +0 -2307
  170. noshot/data/ML TS XAI/NLP/NLP 3/NLP-LAB-4.ipynb +0 -216
  171. noshot/data/ML TS XAI/NLP/NLP 3/NLP-LAB-5.ipynb +0 -216
  172. noshot/data/ML TS XAI/NLP/NLP 3/abc.txt +0 -6
  173. noshot/data/ML TS XAI/NLP/NLP 3/ex-1-nltk.ipynb +0 -711
  174. noshot/data/ML TS XAI/NLP/NLP 3/ex-2-nlp.ipynb +0 -267
  175. noshot/data/ML TS XAI/NLP/NLP 3/exp8&9.ipynb +0 -305
  176. noshot/data/ML TS XAI/NLP/NLP 3/hind.ipynb +0 -287
  177. noshot/data/ML TS XAI/NLP/NLP 3/lab66.ipynb +0 -752
  178. noshot/data/ML TS XAI/NLP/NLP 3/leb_3.ipynb +0 -612
  179. noshot/data/ML TS XAI/NLP/NLP 3/naive_bayes_classifier.pkl +0 -0
  180. noshot/data/ML TS XAI/NLP/NLP 3/nlp_leb_1.ipynb +0 -3008
  181. noshot/data/ML TS XAI/NLP/NLP 3/nlp_leb_2.ipynb +0 -3095
  182. noshot/data/ML TS XAI/NLP/NLP 3/nlplab-9.ipynb +0 -295
  183. noshot/data/ML TS XAI/NLP/NLP 3/nltk-ex-4.ipynb +0 -506
  184. noshot/data/ML TS XAI/NLP/NLP 3/text1.txt +0 -48
  185. noshot/data/ML TS XAI/NLP/NLP 3/text2.txt +0 -8
  186. noshot/data/ML TS XAI/NLP/NLP 3/text3.txt +0 -48
  187. noshot/data/ML TS XAI/NLP/NLP 3/translation-rnn.ipynb +0 -812
  188. noshot/data/ML TS XAI/NLP/NLP 3/word2vector.ipynb +0 -173
  189. noshot/data/ML TS XAI/NLP/NLP 4/Backward Procedure Algorithm.ipynb +0 -179
  190. noshot/data/ML TS XAI/NLP/NLP 4/Chi Square Collocation.ipynb +0 -208
  191. noshot/data/ML TS XAI/NLP/NLP 4/Collocation (T test).ipynb +0 -188
  192. noshot/data/ML TS XAI/NLP/NLP 4/Experiment 1.ipynb +0 -437
  193. noshot/data/ML TS XAI/NLP/NLP 4/Forward Procedure Algorithm.ipynb +0 -132
  194. noshot/data/ML TS XAI/NLP/NLP 4/Hindle Rooth.ipynb +0 -414
  195. noshot/data/ML TS XAI/NLP/NLP 4/MachineTranslation.ipynb +0 -368
  196. noshot/data/ML TS XAI/NLP/NLP 4/Multi Layer Perceptron using MLPClassifier.ipynb +0 -86
  197. noshot/data/ML TS XAI/NLP/NLP 4/Multi Layer Perceptron using Tensorflow.ipynb +0 -112
  198. noshot/data/ML TS XAI/NLP/NLP 4/PCFG Inside Probability.ipynb +0 -451
  199. noshot/data/ML TS XAI/NLP/NLP 4/Text Generation using LSTM.ipynb +0 -297
  200. noshot/data/ML TS XAI/NLP/NLP 4/Viterbi.ipynb +0 -310
  201. noshot/data/ML TS XAI/NLP/NLP 4/Word Sense Disambiguation.ipynb +0 -335
  202. noshot/data/ML TS XAI/NLP/NLP 5/10.Text Generation using LSTM.ipynb +0 -316
  203. noshot/data/ML TS XAI/NLP/NLP 5/11.Machine Translation.ipynb +0 -868
  204. noshot/data/ML TS XAI/NLP/NLP 5/2.T and Chi2 Test.ipynb +0 -204
  205. noshot/data/ML TS XAI/NLP/NLP 5/3.Word Sense Diambiguation.ipynb +0 -234
  206. noshot/data/ML TS XAI/NLP/NLP 5/4.Hinddle and Rooth.ipynb +0 -128
  207. noshot/data/ML TS XAI/NLP/NLP 5/5.Forward and Backward.ipynb +0 -149
  208. noshot/data/ML TS XAI/NLP/NLP 5/6.Viterbi.ipynb +0 -111
  209. noshot/data/ML TS XAI/NLP/NLP 5/7.PCFG Parse Tree.ipynb +0 -134
  210. noshot/data/ML TS XAI/NLP/NLP 5/7.PCFG using cyk.ipynb +0 -101
  211. noshot/data/ML TS XAI/NLP/NLP 5/8.Bag of words and TF-IDF.ipynb +0 -310
  212. noshot/data/ML TS XAI/NLP/NLP 5/9.Word2Vector.ipynb +0 -78
  213. noshot/data/ML TS XAI/NLP/NLP 5/NLP ALL In One.ipynb +0 -2619
  214. noshot/data/ML TS XAI/NLP/NLP 5/sample1.txt +0 -15
  215. noshot/data/ML TS XAI/NLP/NLP 5/sample2.txt +0 -4
  216. noshot/data/ML TS XAI/NLP/NLP 5/word2vec_model.bin +0 -0
  217. noshot/data/ML TS XAI/NLP/NLP 6/1. Tokenize, Tagging, NER, Parse Tree.ipynb +0 -312
  218. noshot/data/ML TS XAI/NLP/NLP 6/2. T Test and Chi2 Test.ipynb +0 -185
  219. noshot/data/ML TS XAI/NLP/NLP 6/3. Naive Bayes WSD.ipynb +0 -199
  220. noshot/data/ML TS XAI/NLP/NLP 6/4. Hinddle and Rooth.ipynb +0 -151
  221. noshot/data/ML TS XAI/NLP/NLP 6/5 and 6 FWD, BWD, Viterbi.ipynb +0 -164
  222. noshot/data/ML TS XAI/NLP/NLP 6/7. PCFG using CYK.ipynb +0 -383
  223. noshot/data/ML TS XAI/NLP/NLP 6/8. BOW and TF-IDF.ipynb +0 -252
  224. noshot/data/ML TS XAI/Ubuntu CN Lab.iso +0 -0
  225. noshot-0.1.7.dist-info/RECORD +0 -216
  226. {noshot-0.1.7.dist-info → noshot-0.1.8.dist-info}/LICENSE.txt +0 -0
  227. {noshot-0.1.7.dist-info → noshot-0.1.8.dist-info}/WHEEL +0 -0
  228. {noshot-0.1.7.dist-info → noshot-0.1.8.dist-info}/top_level.txt +0 -0
@@ -1,506 +0,0 @@
- {
- "cells": [
- {
- "cell_type": "code",
- "execution_count": 4,
- "id": "28f68e97-7003-4655-a9e5-e12a5bc85307",
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "['line', 'guitar', 'jazz', 'jazz']\n",
- "The predicted class for the test document is: g\n",
- "Most Informative Features\n",
- " line = True g : f = 2.0 : 1.0\n",
- " smoked = None g : f = 2.0 : 1.0\n",
- " haul = None g : f = 1.2 : 1.0\n"
- ]
- }
- ],
- "source": [
- "import nltk\n",
- "from nltk.tokenize import word_tokenize\n",
- "from nltk.classify import NaiveBayesClassifier\n",
- "from nltk.classify.util import accuracy\n",
- "\n",
- "# Training data\n",
- "train_data = [\n",
- " (dict([(word, True) for word in word_tokenize('fish smoked fish')]), 'f'),\n",
- " (dict([(word, True) for word in word_tokenize('fish line')]), 'f'),\n",
- " (dict([(word, True) for word in word_tokenize('fish haul smoked')]), 'f'),\n",
- " (dict([(word, True) for word in word_tokenize('guitar jazz line')]), 'g')\n",
- "]\n",
- "\n",
- "# Train the classifier\n",
- "classifier = NaiveBayesClassifier.train(train_data)\n",
- "\n",
- "# Test data\n",
- "test_data = word_tokenize('line guitar jazz jazz')\n",
- "test_features = dict([(word, True) for word in test_data])\n",
- "\n",
- "# Predict the class for the test document\n",
- "predicted_class = classifier.classify(test_features)\n",
- "print(test_data)\n",
- "print(f'The predicted class for the test document is: {predicted_class}')\n",
- "\n",
- "# Show the most informative features\n",
- "classifier.show_most_informative_features()\n"
- ]
- },
- {
53
- "cell_type": "code",
54
- "execution_count": 8,
55
- "id": "d249a4f2-f6d1-453d-b945-c4928df0943e",
56
- "metadata": {},
57
- "outputs": [
58
- {
59
- "name": "stdout",
60
- "output_type": "stream",
61
- "text": [
62
- "P(d5|f) ≈ 2.9749509133099296e-05\n",
63
- "P(d5|g) ≈ 0.0005225684238029916\n",
64
- "The predicted class for the test document is: g\n"
65
- ]
66
- }
67
- ],
68
- "source": [
69
- "from collections import Counter\n",
70
- "import math\n",
71
- "\n",
72
- "# Vocabulary (V)\n",
73
- "vocab = ['fish', 'smoked', 'line', 'haul', 'guitar', 'jazz']\n",
74
- "V = len(vocab)\n",
75
- "\n",
76
- "# Training data (using word counts)\n",
77
- "train_data = {\n",
78
- " 'f': ['fish smoked fish', 'fish line', 'fish haul smoked'],\n",
79
- " 'g': ['guitar jazz line'],\n",
80
- " 'h': ['line guitar jazz']\n",
81
- "}\n",
82
- "\n",
83
- "# Priors\n",
84
- "N = sum(len(doc.split()) for docs in train_data.values() for doc in docs) # Total number of words\n",
85
- "N_f = sum(len(doc.split()) for doc in train_data['f']) # Number of words in class 'f'\n",
86
- "N_g = sum(len(doc.split()) for doc in train_data['g']) # Number of words in class 'g'\n",
87
- "\n",
88
- "P_f = N_f / N # Prior for class 'f'\n",
89
- "P_g = N_g / N # Prior for class 'g'\n",
90
- "\n",
91
- "# Function to calculate conditional probabilities\n",
92
- "def conditional_probability(word, class_docs, total_class_words):\n",
93
- " word_count = sum(doc.split().count(word) for doc in class_docs)\n",
94
- " return (word_count + 1) / (total_class_words + V) # Additive smoothing\n",
95
- "\n",
96
- "# Conditional probabilities for class 'f'\n",
97
- "P_line_f = conditional_probability('line', train_data['f'], N_f)\n",
98
- "P_guitar_f = conditional_probability('guitar', train_data['f'], N_f)\n",
99
- "P_jazz_f = conditional_probability('jazz', train_data['f'], N_f)\n",
100
- "\n",
101
- "# Conditional probabilities for class 'g'\n",
102
- "P_line_g = conditional_probability('line', train_data['g'], N_g)\n",
103
- "P_guitar_g = conditional_probability('guitar', train_data['g'], N_g)\n",
104
- "P_jazz_g = conditional_probability('jazz', train_data['g'], N_g)\n",
105
- "\n",
106
- "# Test document\n",
107
- "test_doc = 'line guitar jazz jazz'.split()\n",
108
- "\n",
109
- "# Calculate the probabilities for each class\n",
110
- "P_d5_f = P_f * P_line_f * P_guitar_f * P_jazz_f * P_jazz_f\n",
111
- "P_d5_g = P_g * P_line_g * P_guitar_g * P_jazz_g * P_jazz_g\n",
112
- "\n",
113
- "# Choosing the class with the highest probability\n",
114
- "if P_d5_f > P_d5_g:\n",
115
- " predicted_class = 'f'\n",
116
- "else:\n",
117
- " predicted_class = 'g'\n",
118
- "\n",
119
- "# Output results\n",
120
- "print(f\"P(d5|f) ≈ {P_d5_f}\")\n",
121
- "print(f\"P(d5|g) ≈ {P_d5_g}\")\n",
122
- "print(f\"The predicted class for the test document is: {predicted_class}\")\n"
123
- ]
124
- },
125
- {
126
- "cell_type": "code",
127
- "execution_count": 9,
128
- "id": "afa29b3e-88b5-4921-a3d6-fc7ae1f9d4cc",
129
- "metadata": {},
130
- "outputs": [
131
- {
132
- "name": "stdout",
133
- "output_type": "stream",
134
- "text": [
135
- "Predicted class for 'The latest update includes new features for security.': sports\n",
136
- "\n",
137
- "Class: sports\n",
138
- " P(the|sports) = 0.25000\n",
139
- " P(latest|sports) (Laplace smoothed) = 0.10000\n",
140
- " P(update|sports) (Laplace smoothed) = 0.10000\n",
141
- " P(includes|sports) (Laplace smoothed) = 0.10000\n",
142
- " P(new|sports) (Laplace smoothed) = 0.10000\n",
143
- " P(features|sports) (Laplace smoothed) = 0.10000\n",
144
- " P(for|sports) (Laplace smoothed) = 0.10000\n",
145
- " P(security|sports) (Laplace smoothed) = 0.10000\n",
146
- " P(.|sports) = 0.12500\n",
147
- " Log-probability: -20.68244\n",
148
- "\n",
149
- "Class: politics\n",
150
- " P(the|politics) = 0.13333\n",
151
- " P(latest|politics) (Laplace smoothed) = 0.10000\n",
152
- " P(update|politics) (Laplace smoothed) = 0.10000\n",
153
- " P(includes|politics) (Laplace smoothed) = 0.10000\n",
154
- " P(new|politics) = 0.06667\n",
155
- " P(features|politics) (Laplace smoothed) = 0.10000\n",
156
- " P(for|politics) (Laplace smoothed) = 0.10000\n",
157
- " P(security|politics) (Laplace smoothed) = 0.10000\n",
158
- " P(.|politics) = 0.13333\n",
159
- " Log-probability: -21.65198\n",
160
- "\n",
161
- "Class: technology\n",
162
- " P(the|technology) = 0.14286\n",
163
- " P(latest|technology) = 0.07143\n",
164
- " P(update|technology) = 0.07143\n",
165
- " P(includes|technology) (Laplace smoothed) = 0.10000\n",
166
- " P(new|technology) = 0.07143\n",
167
- " P(features|technology) = 0.07143\n",
168
- " P(for|technology) (Laplace smoothed) = 0.10000\n",
169
- " P(security|technology) = 0.07143\n",
170
- " P(.|technology) = 0.14286\n",
171
- " Log-probability: -22.79089\n"
172
- ]
173
- }
174
- ],
175
- "source": [
176
- "import math\n",
177
- "from collections import defaultdict, Counter\n",
178
- "from nltk import word_tokenize\n",
179
- "\n",
180
- "# Sample training data (documents and their respective classes)\n",
181
- "training_data = [\n",
182
- " (\"The team won the football match.\", \"sports\"),\n",
183
- " (\"The government passed a new law.\", \"politics\"),\n",
184
- " (\"The latest smartphone has great features.\", \"technology\"),\n",
185
- " (\"The player scored a goal in the match.\", \"sports\"),\n",
186
- " (\"The senator gave a speech on healthcare.\", \"politics\"),\n",
187
- " (\"The new software update improves security.\", \"technology\")\n",
188
- "]\n",
189
- "\n",
190
- "# Feature extraction function (tokenizing words)\n",
191
- "def extract_features(text):\n",
192
- " words = word_tokenize(text.lower())\n",
193
- " return words\n",
194
- "\n",
195
- "# Naive Bayes Document Classifier\n",
196
- "class NaiveBayesClassifier:\n",
197
- " def __init__(self, training_data):\n",
198
- " self.feature_prob = defaultdict(lambda: defaultdict(float))\n",
199
- " self.class_prob = defaultdict(float)\n",
200
- " self.train(training_data)\n",
201
- "\n",
202
- " def train(self, training_data):\n",
203
- " # Calculate prior probabilities for classes\n",
204
- " class_counts = Counter()\n",
205
- " feature_counts = defaultdict(Counter)\n",
206
- "\n",
207
- " for document, category in training_data:\n",
208
- " features = extract_features(document)\n",
209
- " class_counts[category] += 1\n",
210
- " for feature in features:\n",
211
- " feature_counts[category][feature] += 1\n",
212
- "\n",
213
- " total_documents = sum(class_counts.values())\n",
214
- " self.class_prob = {category: count / total_documents for category, count in class_counts.items()}\n",
215
- "\n",
216
- " # Calculate feature probabilities P(feature|category)\n",
217
- " for category, features in feature_counts.items():\n",
218
- " total_features = sum(features.values())\n",
219
- " self.feature_prob[category] = {feature: (count / total_features) for feature, count in features.items()}\n",
220
- "\n",
221
- " def classify(self, document):\n",
222
- " features = extract_features(document)\n",
223
- " max_prob = float('-inf')\n",
224
- " best_class = None\n",
225
- "\n",
226
- " for category in self.class_prob:\n",
227
- " log_prob = math.log(self.class_prob[category])\n",
228
- " for feature in features:\n",
229
- " if feature in self.feature_prob[category]:\n",
230
- " log_prob += math.log(self.feature_prob[category][feature])\n",
231
- " else:\n",
232
- " # Apply Laplace smoothing for unseen features\n",
233
- " log_prob += math.log(1 / (sum(self.feature_prob[category].values()) + len(features)))\n",
234
- "\n",
235
- " if log_prob > max_prob:\n",
236
- " max_prob = log_prob\n",
237
- " best_class = category\n",
238
- "\n",
239
- " return best_class\n",
240
- "\n",
241
- "# Train the model\n",
242
- "nb_classifier = NaiveBayesClassifier(training_data)\n",
243
- "\n",
244
- "# Test the model with a new document\n",
245
- "test_document = \"The latest update includes new features for security.\"\n",
246
- "predicted_class = nb_classifier.classify(test_document)\n",
247
- "print(f\"Predicted class for '{test_document}': {predicted_class}\")\n",
248
- "\n",
249
- "# Calculate and display the conditional probabilities for comparison\n",
250
- "features = extract_features(test_document)\n",
251
- "for category in nb_classifier.class_prob:\n",
252
- " log_prob = math.log(nb_classifier.class_prob[category])\n",
253
- " print(f\"\\nClass: {category}\")\n",
254
- " for feature in features:\n",
255
- " if feature in nb_classifier.feature_prob[category]:\n",
256
- " prob = nb_classifier.feature_prob[category][feature]\n",
257
- " log_prob += math.log(prob)\n",
258
- " print(f\" P({feature}|{category}) = {prob:.5f}\")\n",
259
- " else:\n",
260
- " prob = 1 / (sum(nb_classifier.feature_prob[category].values()) + len(features))\n",
261
- " log_prob += math.log(prob)\n",
262
- " print(f\" P({feature}|{category}) (Laplace smoothed) = {prob:.5f}\")\n",
263
- " print(f\" Log-probability: {log_prob:.5f}\")"
264
- ]
265
- },
266
- {
267
- "cell_type": "code",
268
- "execution_count": 14,
269
- "id": "091c2afd-3fa3-4fc5-b811-7cab42058f24",
270
- "metadata": {},
271
- "outputs": [
272
- {
273
- "name": "stdout",
274
- "output_type": "stream",
275
- "text": [
276
- "Scores: defaultdict(<class 'float'>, {'fish': -8.44644878689048, 'instrument': -8.160518247477505, 'music': -7.491087593534877})\n",
277
- "Predicted Class for target words ['Bass', 'haul', 'line']: music\n"
278
- ]
279
- }
280
- ],
281
- "source": [
282
- "import math\n",
283
- "from collections import defaultdict\n",
284
- "\n",
285
- "# Given data\n",
286
- "data = {\n",
287
- " 1: ['Bass', 'eat', 'super'],\n",
288
- " 2: ['Bass', 'lunch', 'excellent'],\n",
289
- " 3: ['Bass', 'ate', 'like'],\n",
290
- " 4: ['guitar', 'play', 'music'],\n",
291
- " 5: ['Bass', 'interest', 'pay','line'],\n",
292
- " 6: ['guitar','play','melody'],\n",
293
- " 7: ['fish', 'haul','line']\n",
294
- "}\n",
295
- "\n",
296
- "# Corresponding classes (senses)\n",
297
- "classes = {\n",
298
- " 1: 'fish',\n",
299
- " 2: 'fish',\n",
300
- " 3: 'fish',\n",
301
- " 4: 'instrument',\n",
302
- " 5: 'music',\n",
303
- " 6: 'instrument',\n",
304
- " 7: 'fish'\n",
305
- "}\n",
306
- "\n",
307
- "# 1) Calculate priors\n",
308
- "class_counts = defaultdict(int)\n",
309
- "for cls in classes.values():\n",
310
- " class_counts[cls] += 1\n",
311
- "\n",
312
- "# Number of documents\n",
313
- "N = len(classes)\n",
314
- "\n",
315
- "# Calculate prior probabilities\n",
316
- "priors = {cls: count / N for cls, count in class_counts.items()}\n",
317
- "\n",
318
- "# 2) Calculate the conditional probability of each word with each class\n",
319
- "word_counts = defaultdict(lambda: defaultdict(int))\n",
320
- "\n",
321
- "for idx, words in data.items():\n",
322
- " cls = classes[idx]\n",
323
- " for word in words:\n",
324
- " word_counts[cls][word] += 1\n",
325
- "\n",
326
- "# Total words per class\n",
327
- "total_words_per_class = {cls: sum(counts.values()) for cls, counts in word_counts.items()}\n",
328
- "\n",
329
- "# Calculate conditional probabilities\n",
330
- "conditional_probabilities = defaultdict(dict)\n",
331
- "\n",
332
- "for cls, words in word_counts.items():\n",
333
- " for word, count in words.items():\n",
334
- " # Use Laplace smoothing\n",
335
- " conditional_probabilities[cls][word] = (count + 1) / (total_words_per_class[cls] + len(word_counts[cls]))\n",
336
- "\n",
337
- "# 3) Define the target words and find v (count of words in to-be-found case/test case)\n",
338
- "target_words = ['Bass', 'haul', 'line']\n",
339
- "\n",
340
- "# 4) Score calculation\n",
341
- "scores = defaultdict(float)\n",
342
- "\n",
343
- "# Calculate scores for each class\n",
344
- "for cls in priors.keys():\n",
345
- " scores[cls] = math.log(priors[cls]) # Initialize with log prior\n",
346
- "\n",
347
- " for word in target_words:\n",
348
- " vj = word.lower() # Convert word to lower case for comparison\n",
349
- " if vj in conditional_probabilities[cls]:\n",
350
- " scores[cls] += math.log(conditional_probabilities[cls][vj])\n",
351
- " else:\n",
352
- " # If the word is not found, assume a small probability (Laplace smoothing)\n",
353
- " scores[cls] += math.log(1 / (total_words_per_class[cls] + len(word_counts[cls])))\n",
354
- "\n",
355
- "# Determine the class with the highest score\n",
356
- "predicted_class = max(scores, key=scores.get)\n",
357
- "\n",
358
- "# Output the results\n",
359
- "print(f\"Scores: {scores}\")\n",
360
- "print(f\"Predicted Class for target words {target_words}: {predicted_class}\")\n"
361
- ]
362
- },
363
- {
364
- "cell_type": "code",
365
- "execution_count": 16,
366
- "id": "f9f2051b-ea3a-4817-b48b-2357cc41cbc2",
367
- "metadata": {},
368
- "outputs": [
369
- {
370
- "name": "stdin",
371
- "output_type": "stream",
372
- "text": [
373
- "ENTER target words: guitar bass melody interest pay rate play\n"
374
- ]
375
- },
376
- {
377
- "name": "stdout",
378
- "output_type": "stream",
379
- "text": [
380
- "Scores: defaultdict(<class 'float'>, {'fish': -23.233017526731523, 'instrument': -20.29826933979052, 'finance': -21.373095366600488})\n",
381
- "Predicted Class for target words ['guitar', 'bass', 'melody', 'interest', 'pay', 'rate', 'play']: instrument\n"
382
- ]
383
- }
384
- ],
385
- "source": [
386
- "import math\n",
387
- "from collections import defaultdict\n",
388
- "from nltk import word_tokenize\n",
389
- "\n",
390
- "# Given data\n",
391
- "data = {\n",
392
- " 1: ['bass', 'eat', 'amount'],\n",
393
- " 2: ['bass', 'lunch', 'excellent'],\n",
394
- " 3: ['bass', 'ate','like'],\n",
395
- " 4: ['guitar', 'play', 'music'],\n",
396
- " 5: ['money', 'interest', 'pay','amount'],\n",
397
- " 6: ['guitar','interest','melody'],\n",
398
- " 7: ['fish', 'haul','line'],\n",
399
- " 8: ['guitar','like','play'],\n",
400
- " 9: ['rate']\n",
401
- "}\n",
402
- "\n",
403
- "# Corresponding classes (senses)\n",
404
- "classes = {\n",
405
- " 1: 'fish',\n",
406
- " 2: 'fish',\n",
407
- " 3: 'fish',\n",
408
- " 4: 'instrument',\n",
409
- " 5: 'finance',\n",
410
- " 6: 'instrument',\n",
411
- " 7: 'fish',\n",
412
- " 8: 'instrument',\n",
413
- " 9: 'finance'\n",
414
- "}\n",
415
- "\n",
416
- "# 1) Calculate priors\n",
417
- "class_counts = defaultdict(int)\n",
418
- "for cls in classes.values():\n",
419
- " class_counts[cls] += 1\n",
420
- "\n",
421
- "# Number of documents\n",
422
- "N = len(classes)\n",
423
- "\n",
424
- "# Calculate prior probabilities\n",
425
- "priors = {cls: count / N for cls, count in class_counts.items()}\n",
426
- "\n",
427
- "# 2) Calculate the conditional probability of each word with each class\n",
428
- "word_counts = defaultdict(lambda: defaultdict(int))\n",
429
- "\n",
430
- "for idx, words in data.items():\n",
431
- " cls = classes[idx]\n",
432
- " for word in words:\n",
433
- " word_counts[cls][word.lower()] += 1 # Convert words to lowercase for consistency\n",
434
- "\n",
435
- "# Total words per class\n",
436
- "total_words_per_class = {cls: sum(counts.values()) for cls, counts in word_counts.items()}\n",
437
- "\n",
438
- "# Full vocabulary size\n",
439
- "vocab = set(word.lower() for words in data.values() for word in words)\n",
440
- "V = len(vocab)\n",
441
- "\n",
442
- "# Calculate conditional probabilities\n",
443
- "conditional_probabilities = defaultdict(dict)\n",
444
- "\n",
445
- "for cls, words in word_counts.items():\n",
446
- " for word in vocab:\n",
447
- " count = words[word] # This will be 0 if the word isn't in the class\n",
448
- " conditional_probabilities[cls][word] = (count + 1) / (total_words_per_class[cls] + V)\n",
449
- "\n",
450
- "# 3) Define the target words and find v (count of words in to-be-found case/test case)\n",
451
- "x = input(\"ENTER target words:\")\n",
452
- "target_words = word_tokenize(x)\n",
453
- "# 4) Score calculation\n",
454
- "scores = defaultdict(float)\n",
455
- "\n",
456
- "# Calculate scores for each class\n",
457
- "for cls in priors.keys():\n",
458
- " scores[cls] = math.log(priors[cls]) # Initialize with log prior\n",
459
- "\n",
460
- " for word in target_words:\n",
461
- " vj = word.lower() # Convert word to lower case for comparison\n",
462
- " if vj in conditional_probabilities[cls]:\n",
463
- " scores[cls] += math.log(conditional_probabilities[cls][vj])\n",
464
- " else:\n",
465
- " # If the word is not found, assume a small probability (Laplace smoothing)\n",
466
- " scores[cls] += math.log(1 / (total_words_per_class[cls] + V))\n",
467
- "\n",
468
- "# Determine the class with the highest score\n",
469
- "predicted_class = max(scores, key=scores.get)\n",
470
- "\n",
471
- "# Output the results\n",
472
- "print(f\"Scores: {scores}\")\n",
473
- "print(f\"Predicted Class for target words {target_words}: {predicted_class}\")\n"
474
- ]
475
- },
476
- {
477
- "cell_type": "code",
478
- "execution_count": null,
479
- "id": "58fa93e9-c1dd-408e-b4ee-1676206f3cd3",
480
- "metadata": {},
481
- "outputs": [],
482
- "source": []
483
- }
484
- ],
485
- "metadata": {
486
- "kernelspec": {
487
- "display_name": "Python 3 (ipykernel)",
488
- "language": "python",
489
- "name": "python3"
490
- },
491
- "language_info": {
492
- "codemirror_mode": {
493
- "name": "ipython",
494
- "version": 3
495
- },
496
- "file_extension": ".py",
497
- "mimetype": "text/x-python",
498
- "name": "python",
499
- "nbconvert_exporter": "python",
500
- "pygments_lexer": "ipython3",
501
- "version": "3.11.1"
502
- }
503
- },
504
- "nbformat": 4,
505
- "nbformat_minor": 5
506
- }
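Note on the two removed notebook cells above: the first smooths each class with its own vocabulary size (`len(word_counts[cls])`), while the second uses the full vocabulary size `V`, which is the usual add-one (Laplace) form. A minimal sketch of the second variant, packaged as two functions, might look like the following. The names `train_nb` and `score_nb`, and the use of `str.split` in place of `nltk.word_tokenize`, are assumptions made for illustration and are not part of the removed notebook; the `data` and `classes` dictionaries are copied from the second cell.

```python
import math
from collections import Counter, defaultdict

# Toy corpus and sense labels copied from the second removed cell.
data = {
    1: ['bass', 'eat', 'amount'],
    2: ['bass', 'lunch', 'excellent'],
    3: ['bass', 'ate', 'like'],
    4: ['guitar', 'play', 'music'],
    5: ['money', 'interest', 'pay', 'amount'],
    6: ['guitar', 'interest', 'melody'],
    7: ['fish', 'haul', 'line'],
    8: ['guitar', 'like', 'play'],
    9: ['rate'],
}
classes = {1: 'fish', 2: 'fish', 3: 'fish', 4: 'instrument', 5: 'finance',
           6: 'instrument', 7: 'fish', 8: 'instrument', 9: 'finance'}


def train_nb(data, classes):
    """Hypothetical helper: collect priors, per-class word counts and vocabulary size."""
    class_counts = Counter(classes.values())
    priors = {c: n / len(classes) for c, n in class_counts.items()}

    word_counts = defaultdict(Counter)
    for idx, words in data.items():
        word_counts[classes[idx]].update(w.lower() for w in words)

    totals = {c: sum(counts.values()) for c, counts in word_counts.items()}
    vocab_size = len({w.lower() for words in data.values() for w in words})
    return priors, word_counts, totals, vocab_size


def score_nb(target_words, priors, word_counts, totals, vocab_size):
    """Hypothetical helper: log prior plus add-one-smoothed log likelihoods per class."""
    scores = {}
    for c, prior in priors.items():
        score = math.log(prior)
        for w in target_words:
            count = word_counts[c][w.lower()]  # a Counter returns 0 for unseen words
            score += math.log((count + 1) / (totals[c] + vocab_size))
        scores[c] = score
    return scores


model = train_nb(data, classes)
scores = score_nb("guitar bass melody interest pay rate play".split(), *model)
predicted = max(scores, key=scores.get)
print(scores, predicted)
```

Because every word, seen or unseen, goes through the same `(count + 1) / (total + V)` expression, no separate fallback branch is needed. On the input recorded in the removed cell's output, this sketch should select the same class, `instrument`.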
@@ -1,48 +0,0 @@
1
- The Impact of Artificial Intelligence on Data Science
2
- Introduction
3
- In recent years, the convergence of artificial intelligence (AI) and data science has revolutionized numerous fields, leading to significant advancements in technology, healthcare, finance, and more. AI, with its ability to mimic human intelligence, and data science, which focuses on extracting knowledge from data, together form a powerful combination that drives innovation and efficiency. This essay explores the impact of AI on data science, highlighting key areas where AI has transformed data processing, analysis, and decision-making.
4
-
5
- Enhancing Data Processing Capabilities
6
- One of the primary ways AI has impacted data science is by enhancing data processing capabilities. Traditional data processing methods often struggle to handle the vast amounts of data generated in today's digital age. AI algorithms, particularly those involving machine learning (ML) and deep learning, can process and analyze massive datasets with unprecedented speed and accuracy.
7
-
8
- Machine learning algorithms, for instance, can identify patterns and trends in large datasets that would be impossible for humans to detect manually. This capability is particularly valuable in fields such as healthcare, where analyzing patient data can lead to early diagnosis and personalized treatment plans. In finance, AI-driven data processing can detect fraudulent activities and predict market trends, enabling more informed investment decisions.
9
-
10
- Automating Data Cleaning and Preparation
11
- Data cleaning and preparation are crucial steps in the data science workflow, often accounting for a significant portion of the time spent on a project. AI has significantly improved the efficiency of these tasks through automation. Techniques such as natural language processing (NLP) and computer vision can automatically identify and correct errors, inconsistencies, and missing values in datasets.
12
-
13
- For example, NLP algorithms can process unstructured text data, extracting relevant information and transforming it into a structured format suitable for analysis. Similarly, computer vision techniques can analyze images and videos, identifying objects and extracting meaningful features. By automating these processes, AI reduces the manual effort required for data cleaning and preparation, allowing data scientists to focus on higher-level analytical tasks.
14
-
15
- Advancing Predictive Analytics
16
- Predictive analytics is a core component of data science, enabling organizations to make data-driven decisions by forecasting future trends and outcomes. AI has significantly advanced predictive analytics through the development of sophisticated algorithms that can accurately model complex relationships within data.
17
-
18
- Machine learning models, such as regression, decision trees, and neural networks, can predict outcomes based on historical data. These models continuously learn and improve as new data becomes available, enhancing their predictive accuracy over time. In industries like retail, predictive analytics powered by AI can optimize inventory management, forecast customer demand, and personalize marketing strategies.
19
-
20
- Enabling Real-Time Data Analysis
21
- The ability to analyze data in real-time is crucial in many applications, such as autonomous vehicles, financial trading, and cybersecurity. AI has enabled real-time data analysis by leveraging techniques like stream processing and edge computing.
22
-
23
- Stream processing involves analyzing data as it is generated, allowing for immediate insights and actions. AI algorithms can process streaming data from sensors, social media, and other sources, identifying anomalies and triggering alerts in real-time. In autonomous vehicles, real-time data analysis is essential for making split-second decisions to ensure safety and navigation.
24
-
25
- Edge computing brings data processing closer to the source of data generation, reducing latency and bandwidth requirements. AI models deployed on edge devices can analyze data locally, making real-time decisions without relying on centralized cloud servers. This capability is particularly valuable in scenarios where quick response times are critical, such as industrial automation and healthcare monitoring.
26
-
27
- Facilitating Advanced Data Visualization
28
- Data visualization is a vital aspect of data science, enabling stakeholders to understand complex data through graphical representations. AI has facilitated advanced data visualization techniques that provide deeper insights and more intuitive understanding.
29
-
30
- AI-driven data visualization tools can automatically generate visualizations based on the characteristics of the data, highlighting key trends and outliers. These tools can also create interactive dashboards that allow users to explore data dynamically, adjusting parameters and filters to uncover hidden patterns. For example, AI-powered visualization platforms in business intelligence can present sales data in interactive charts and graphs, enabling executives to make data-driven decisions quickly.
31
-
32
- Transforming Natural Language Processing
33
- Natural language processing (NLP) is a subfield of AI that focuses on the interaction between computers and human language. NLP has transformed data science by enabling the analysis of unstructured text data, which constitutes a significant portion of the data generated today.
34
-
35
- AI-powered NLP algorithms can perform tasks such as sentiment analysis, entity recognition, and text summarization. These capabilities are invaluable in applications like social media monitoring, where analyzing customer sentiments and trends can inform marketing strategies. In healthcare, NLP can process clinical notes and research papers, extracting valuable insights for medical research and patient care.
36
-
37
- Improving Decision-Making Processes
38
- AI has fundamentally transformed decision-making processes in data science by providing more accurate and actionable insights. Decision support systems powered by AI can analyze vast amounts of data, evaluate multiple scenarios, and recommend optimal courses of action.
39
-
40
- In supply chain management, for example, AI-driven decision support systems can optimize inventory levels, predict demand fluctuations, and identify potential disruptions. In the financial sector, AI algorithms can assess credit risks, detect fraudulent activities, and optimize investment portfolios. By leveraging AI, organizations can make more informed and data-driven decisions, reducing risks and enhancing operational efficiency.
41
-
42
- Addressing Ethical and Bias Concerns
43
- While AI has brought numerous benefits to data science, it also raises important ethical and bias concerns. AI algorithms can inadvertently perpetuate biases present in the training data, leading to unfair or discriminatory outcomes. Addressing these issues is crucial to ensure the responsible and ethical use of AI in data science.
44
-
45
- Efforts to mitigate bias in AI include developing fairness-aware algorithms, ensuring diverse and representative training data, and implementing transparent and explainable AI models. Additionally, ethical guidelines and regulations are being established to govern the use of AI in various applications, ensuring that AI systems are designed and deployed in a manner that respects human rights and societal values.
46
-
47
- Conclusion
48
- The impact of artificial intelligence on data science is profound and far-reaching. AI has enhanced data processing capabilities, automated data cleaning and preparation, advanced predictive analytics, enabled real-time data analysis, facilitated advanced data visualization, transformed natural language processing, improved decision-making processes, and addressed ethical concerns. As AI continues to evolve, its integration with data science will drive further innovation and transformation across various industries. Embracing the synergy between AI and data science is essential for organizations seeking to harness the full potential of their data and stay competitive in an increasingly data-driven world.
@@ -1,8 +0,0 @@
1
- Guy: How old are you?
2
- Hipster girl: You know, I never answer that question. Because to me, it's about
3
- how mature you are, you know? I mean, a fourteen year old could be more mature
4
- than a twenty-five year old, right? I'm sorry, I just never answer that question.
5
- Guy: But, uh, you're older than eighteen, right?
6
- Hipster girl: Oh, yeah.
7
-
8
-
@@ -1,48 +0,0 @@
1
- The Impact of Artificial Intelligence on Data Science
2
- Introduction
3
- In recent years, the convergence of artificial intelligence (AI) and data science has revolutionized numerous fields, leading to significant advancements in technology, healthcare, finance, and more. AI, with its ability to mimic human intelligence, and data science, which focuses on extracting knowledge from data, together form a powerful combination that drives innovation and efficiency. This essay explores the impact of AI on data science, highlighting key areas where AI has transformed data processing, analysis, and decision-making.
4
-
5
- Enhancing Data Processing Capabilities
6
- One of the primary ways AI has impacted data science is by enhancing data processing capabilities. Traditional data processing methods often struggle to handle the vast amounts of data generated in today's digital age. AI algorithms, particularly those involving machine learning (ML) and deep learning, can process and analyze massive datasets with unprecedented speed and accuracy.
7
-
8
- Machine learning algorithms, for instance, can identify patterns and trends in large datasets that would be impossible for humans to detect manually. This capability is particularly valuable in fields such as healthcare, where analyzing patient data can lead to early diagnosis and personalized treatment plans. In finance, AI-driven data processing can detect fraudulent activities and predict market trends, enabling more informed investment decisions.
9
-
10
- Automating Data Cleaning and Preparation
11
- Data cleaning and preparation are crucial steps in the data science workflow, often accounting for a significant portion of the time spent on a project. AI has significantly improved the efficiency of these tasks through automation. Techniques such as natural language processing (NLP) and computer vision can automatically identify and correct errors, inconsistencies, and missing values in datasets.
12
-
13
- For example, NLP algorithms can process unstructured text data, extracting relevant information and transforming it into a structured format suitable for analysis. Similarly, computer vision techniques can analyze images and videos, identifying objects and extracting meaningful features. By automating these processes, AI reduces the manual effort required for data cleaning and preparation, allowing data scientists to focus on higher-level analytical tasks.
14
-
15
- Advancing Predictive Analytics
16
- Predictive analytics is a core component of data science, enabling organizations to make data-driven decisions by forecasting future trends and outcomes. AI has significantly advanced predictive analytics through the development of sophisticated algorithms that can accurately model complex relationships within data.
17
-
18
- Machine learning models, such as regression, decision trees, and neural networks, can predict outcomes based on historical data. These models continuously learn and improve as new data becomes available, enhancing their predictive accuracy over time. In industries like retail, predictive analytics powered by AI can optimize inventory management, forecast customer demand, and personalize marketing strategies.
19
-
20
- Enabling Real-Time Data Analysis
21
- The ability to analyze data in real-time is crucial in many applications, such as autonomous vehicles, financial trading, and cybersecurity. AI has enabled real-time data analysis by leveraging techniques like stream processing and edge computing.
22
-
23
- Stream processing involves analyzing data as it is generated, allowing for immediate insights and actions. AI algorithms can process streaming data from sensors, social media, and other sources, identifying anomalies and triggering alerts in real-time. In autonomous vehicles, real-time data analysis is essential for making split-second decisions to ensure safety and navigation.
24
-
25
- Edge computing brings data processing closer to the source of data generation, reducing latency and bandwidth requirements. AI models deployed on edge devices can analyze data locally, making real-time decisions without relying on centralized cloud servers. This capability is particularly valuable in scenarios where quick response times are critical, such as industrial automation and healthcare monitoring.
26
-
27
- Facilitating Advanced Data Visualization
28
- Data visualization is a vital aspect of data science, enabling stakeholders to understand complex data through graphical representations. AI has facilitated advanced data visualization techniques that provide deeper insights and more intuitive understanding.
29
-
30
- AI-driven data visualization tools can automatically generate visualizations based on the characteristics of the data, highlighting key trends and outliers. These tools can also create interactive dashboards that allow users to explore data dynamically, adjusting parameters and filters to uncover hidden patterns. For example, AI-powered visualization platforms in business intelligence can present sales data in interactive charts and graphs, enabling executives to make data-driven decisions quickly.
31
-
32
- Transforming Natural Language Processing
33
- Natural language processing (NLP) is a subfield of AI that focuses on the interaction between computers and human language. NLP has transformed data science by enabling the analysis of unstructured text data, which constitutes a significant portion of the data generated today.
34
-
35
- AI-powered NLP algorithms can perform tasks such as sentiment analysis, entity recognition, and text summarization. These capabilities are invaluable in applications like social media monitoring, where analyzing customer sentiments and trends can inform marketing strategies. In healthcare, NLP can process clinical notes and research papers, extracting valuable insights for medical research and patient care.
36
-
37
- Improving Decision-Making Processes
38
- AI has fundamentally transformed decision-making processes in data science by providing more accurate and actionable insights. Decision support systems powered by AI can analyze vast amounts of data, evaluate multiple scenarios, and recommend optimal courses of action.
39
-
40
- In supply chain management, for example, AI-driven decision support systems can optimize inventory levels, predict demand fluctuations, and identify potential disruptions. In the financial sector, AI algorithms can assess credit risks, detect fraudulent activities, and optimize investment portfolios. By leveraging AI, organizations can make more informed and data-driven decisions, reducing risks and enhancing operational efficiency.
41
-
42
- Addressing Ethical and Bias Concerns
43
- While AI has brought numerous benefits to data science, it also raises important ethical and bias concerns. AI algorithms can inadvertently perpetuate biases present in the training data, leading to unfair or discriminatory outcomes. Addressing these issues is crucial to ensure the responsible and ethical use of AI in data science.
44
-
45
- Efforts to mitigate bias in AI include developing fairness-aware algorithms, ensuring diverse and representative training data, and implementing transparent and explainable AI models. Additionally, ethical guidelines and regulations are being established to govern the use of AI in various applications, ensuring that AI systems are designed and deployed in a manner that respects human rights and societal values.
46
-
47
- Conclusion
48
- The impact of artificial intelligence on data science is profound and far-reaching. AI has enhanced data processing capabilities, automated data cleaning and preparation, advanced predictive analytics, enabled real-time data analysis, facilitated advanced data visualization, transformed natural language processing, improved decision-making processes, and addressed ethical concerns. As AI continues to evolve, its integration with data science will drive further innovation and transformation across various industries. Embracing the synergy between AI and data science is essential for organizations seeking to harness the full potential of their data and stay competitive in an increasingly data-driven world.