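I decided to write this kernel when I started learning NLP. It is basically a record of what I learned, kept in notebook form. It may be helpful if you are looking for data analysis on the competition data, feature engineering ideas for NLP, cleaning and text processing ideas, a baseline BERT model, or a test set with labels.
This kernel includes code and ideas from the following kernels. If this kernel helps you, please upvote their work as well.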
Simple Exploration Notebook - QIQC by @sudalairajkumar
How to: Preprocessing when using embeddings by @christofhenkel
Improve your Score with some Text Preprocessing by @theoviel
A Real Disaster - Leaked Label by @szelee
Disaster NLP: Keras BERT using TFHub by @xhlulu
In [1]:
!pip install --user -q wordcloud tensorflow_hub
In [2]:
# Standard library
import gc
import re
import string
import operator
from collections import defaultdict

# Data handling
import numpy as np
import pandas as pd
# Widen pandas display limits so large frames print without truncation
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)

# Plotting
import matplotlib.pyplot as plt
import seaborn as sns

# Text processing; `tokenization` is a local module, not installed by the pip cell above
import tokenization
from wordcloud import STOPWORDS

# Cross-validation splits and evaluation metrics
from sklearn.model_selection import StratifiedKFold, StratifiedShuffleSplit
from sklearn.metrics import precision_score, recall_score, f1_score

# Modelling
import tensorflow as tf
import tensorflow_hub as hub
from tensorflow import keras
from tensorflow.keras.optimizers import SGD, Adam
from tensorflow.keras.layers import Dense, Input, Dropout, GlobalAveragePooling1D
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping, Callback
