Commit a1e18531 authored by Nathalia Moraes do Nascimento

additional files

parent 37968fa4
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install -U spacy"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import spacy\n",
"import re\n",
"from spacy import displacy\n",
"from collections import Counter\n",
"\n",
"import pandas as pd"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!python -m spacy download en_core_web_sm"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!python -m spacy download pt_core_news_sm"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"text = \"\"\"Three more countries have joined an \"international grand committee\" of parliaments, adding to calls for Facebook’s boss, Nathalia Nascimento, to give evidence on misinformation to the coalition. Brazil, Latvia and Singapore bring the total to eight different parliaments across the world, with plans to send representatives to London on 27 November with the intention of hearing from Zuckerberg. Since the Cambridge Analytica scandal broke, the Facebook chief has only appeared in front of two legislatures: the American Senate and House of Representatives, and the European parliament. Facebook has consistently rebuffed attempts from others, including the UK and Canadian parliaments, to hear from Zuckerberg. He added that an article in the New York Times on Thursday, in which the paper alleged a pattern of behaviour from Facebook to \"delay, deny and deflect\" negative news stories, \"raises further questions about how recent data breaches were allegedly dealt with within Facebook.\"\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#text_pt = \"* 01.11.17 13:59:16 Ronaldo da Silva Menezes Junior (K5KL) * Nathalia Moraes do Nascimento* Executar recomendações da Pendência de Classe 1204 nos Prumos e Borboletas tanque de lastro B23. * * FOUND: * * 1 - The four (4) vertical - T- stiffeners of the FWD transverse bulkhead were found heavily * diagonally bent in way of upper and lower sections of the web plates, approximately * between the top and bottom plates to the 5th horizontal stiffener (bulb profile) of the fwd * transverse bulkhead, counted respectively from top and bottom plates; * 2 - The approximate depth of each affected area is varying 100 to 210 mm; * 3 - The connecting brackets S04, S05 and S06, between the FWD transverse bulkhead * and the side shell (including bulb profiles of the fwd transverse bulkhead) were found * distorted / set-in. * 4 - All twelve (12) connecting brackets between the AFT transverse bulkhead and the * stbd side longitudinal bulkhead were found slightly distorted / set-in. * * RECOMMENDATION: * * The aforementioned affected areas (from 1 to 3) shall be partly cropped and renewed as * original. * 16.11.2017 11:20:26 Marcio Viana de Araujo (IMWI) * NOTA CONFIRMADA COMO PENDÊNCIA DE CLASSE, DATA DE VENCIMENTO 25.01.2018. (CONFORME ANEXO). * 01.03.2018 10:34:00 Marcio Viana de Araujo (IMWI) * NOVA DATA DE VENCIMENTO INFORMADA PELO IPP/EN (EM ANEXO) - 24.04.2018 * 05.09.2018 10:38:06 Luciana Dias Martins (YMFJ) * NOVA DATA DE VENCIMENTO 22/10/2018 CONFORME CERTIFICADO DA ABS * NÚMERO 6607537-3530777-001. * 28.12.2018 09:05:17 Ronaldo da Silva Menezes Junior (K5KL) * NOVA DATA DE VENCIMENTO 23/01/2019 CONFORME CERTIFICADO DA ABS. * * 16.01.2019 15:17:22 Rafael Blunck Silveira Ferrarezi (YR2J) * Ao que tudo indica, esta nota se relaciona à Nota ZO , que registra nossa pendência junto à Classificadora. Foi inserida a informação no campo - Origem- desta nota ZS. Toda e qualquer conferência de prazos e postergações devem ser conferidas na Nota ZO 10611844, que é o dispositivo utilizado para a gestão de conformidade legal. * Rafael Blunck - 767-1002. * 03.05.2019 10:10:09 Douglas Folly de Andrade (A0OC) * * Pendência de classe quitada conforme informações na NOTA ZO 10611844. * 14.08.2019 09:42:23 Douglas Folly de Andrade (A0OC) * * Informo que esta nota trata também a Pend 1451 Ballast Tank F2-B3 (B23)monitorada pela nota ZO 11435419 £301006\"\n",
"text_pt = \"teste\"\n",
"text = text_pt"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"lang = 'pt_core_news_sm'  # use 'en_core_web_sm' for the English text"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# replace newlines with spaces so the text is processed as a single line\n",
"text = re.sub(r'\\n', ' ', text)\n",
"nlp = spacy.load(lang)\n",
"text_nlp = nlp(text)\n",
"# pair each token with its entity type (empty string when the token is outside any entity)\n",
"ner_tagged = [(word.text, word.ent_type_) for word in text_nlp]\n",
"print(ner_tagged)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# visualize named entities\n",
"displacy.render(text_nlp, style='ent', jupyter=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"named_entities = []\n",
"temp_entity_name = \"\"\n",
"temp_named_entity = None\n",
"# join consecutive tagged tokens into one (entity text, tag) pair\n",
"for term, tag in ner_tagged:\n",
"    if tag:\n",
"        temp_entity_name = ' '.join([temp_entity_name, term]).strip()\n",
"        temp_named_entity = (temp_entity_name, tag)\n",
"    else:\n",
"        if temp_named_entity:\n",
"            named_entities.append(temp_named_entity)\n",
"            temp_entity_name = \"\"\n",
"            temp_named_entity = None\n",
"# flush an entity that runs up to the last token\n",
"if temp_named_entity:\n",
"    named_entities.append(temp_named_entity)\n",
"print(named_entities)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# viewing the top entity types\n",
"c = Counter([item[1] for item in named_entities])\n",
"c.most_common()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# See page 544: how to create your own annotated corpus for named entity recognition\n",
"- This makes it possible to define other categories, such as ELEMENT (e.g., screws, nuts, box) and PROBLEM (e.g., corrosion)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df = pd.read_csv('ner_dataset.csv', encoding='ISO-8859-1')\n",
"# forward-fill missing values; fillna(method='ffill') is deprecated in recent pandas\n",
"df = df.ffill()\n",
"df.info()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df.T"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df['Sentence #'].nunique(), df.Word.nunique(), df.POS.nunique(), df.Tag.nunique()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df.Tag.value_counts()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.9"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
%% Cell type:code id: tags:
``` python
!pip install -U spacy
```
%% Cell type:code id: tags:
``` python
import spacy
import re
from spacy import displacy
from collections import Counter
import pandas as pd
```
%% Cell type:code id: tags:
``` python
!python -m spacy download en_core_web_sm
```
%% Cell type:code id: tags:
``` python
!python -m spacy download pt_core_news_sm
```
%% Cell type:code id: tags:
``` python
text = """Three more countries have joined an "international grand committee" of parliaments, adding to calls for Facebook’s boss, Nathalia Nascimento, to give evidence on misinformation to the coalition. Brazil, Latvia and Singapore bring the total to eight different parliaments across the world, with plans to send representatives to London on 27 November with the intention of hearing from Zuckerberg. Since the Cambridge Analytica scandal broke, the Facebook chief has only appeared in front of two legislatures: the American Senate and House of Representatives, and the European parliament. Facebook has consistently rebuffed attempts from others, including the UK and Canadian parliaments, to hear from Zuckerberg. He added that an article in the New York Times on Thursday, in which the paper alleged a pattern of behaviour from Facebook to "delay, deny and deflect" negative news stories, "raises further questions about how recent data breaches were allegedly dealt with within Facebook."
"""
```
%% Cell type:code id: tags:
``` python
#text_pt = "* 01.11.17 13:59:16 Ronaldo da Silva Menezes Junior (K5KL) * Nathalia Moraes do Nascimento* Executar recomendações da Pendência de Classe 1204 nos Prumos e Borboletas tanque de lastro B23. * * FOUND: * * 1 - The four (4) vertical - T- stiffeners of the FWD transverse bulkhead were found heavily * diagonally bent in way of upper and lower sections of the web plates, approximately * between the top and bottom plates to the 5th horizontal stiffener (bulb profile) of the fwd * transverse bulkhead, counted respectively from top and bottom plates; * 2 - The approximate depth of each affected area is varying 100 to 210 mm; * 3 - The connecting brackets S04, S05 and S06, between the FWD transverse bulkhead * and the side shell (including bulb profiles of the fwd transverse bulkhead) were found * distorted / set-in. * 4 - All twelve (12) connecting brackets between the AFT transverse bulkhead and the * stbd side longitudinal bulkhead were found slightly distorted / set-in. * * RECOMMENDATION: * * The aforementioned affected areas (from 1 to 3) shall be partly cropped and renewed as * original. * 16.11.2017 11:20:26 Marcio Viana de Araujo (IMWI) * NOTA CONFIRMADA COMO PENDÊNCIA DE CLASSE, DATA DE VENCIMENTO 25.01.2018. (CONFORME ANEXO). * 01.03.2018 10:34:00 Marcio Viana de Araujo (IMWI) * NOVA DATA DE VENCIMENTO INFORMADA PELO IPP/EN (EM ANEXO) - 24.04.2018 * 05.09.2018 10:38:06 Luciana Dias Martins (YMFJ) * NOVA DATA DE VENCIMENTO 22/10/2018 CONFORME CERTIFICADO DA ABS * NÚMERO 6607537-3530777-001. * 28.12.2018 09:05:17 Ronaldo da Silva Menezes Junior (K5KL) * NOVA DATA DE VENCIMENTO 23/01/2019 CONFORME CERTIFICADO DA ABS. * * 16.01.2019 15:17:22 Rafael Blunck Silveira Ferrarezi (YR2J) * Ao que tudo indica, esta nota se relaciona à Nota ZO , que registra nossa pendência junto à Classificadora. Foi inserida a informação no campo - Origem- desta nota ZS. Toda e qualquer conferência de prazos e postergações devem ser conferidas na Nota ZO 10611844, que é o dispositivo utilizado para a gestão de conformidade legal. * Rafael Blunck - 767-1002. * 03.05.2019 10:10:09 Douglas Folly de Andrade (A0OC) * * Pendência de classe quitada conforme informações na NOTA ZO 10611844. * 14.08.2019 09:42:23 Douglas Folly de Andrade (A0OC) * * Informo que esta nota trata também a Pend 1451 Ballast Tank F2-B3 (B23)monitorada pela nota ZO 11435419 £301006"
text_pt = "teste"
text = text_pt
```
%% Cell type:code id: tags:
``` python
lang = 'pt_core_news_sm'  # use 'en_core_web_sm' for the English text
```
%% Cell type:code id: tags:
``` python
# replace newlines with spaces so the text is processed as a single line
text = re.sub(r'\n', ' ', text)
nlp = spacy.load(lang)
text_nlp = nlp(text)
# pair each token with its entity type (empty string when the token is outside any entity)
ner_tagged = [(word.text, word.ent_type_) for word in text_nlp]
print(ner_tagged)
```
%% Cell type:code id: tags:
``` python
# visualize named entities
displacy.render(text_nlp, style='ent', jupyter=True)
```
%% Cell type:code id: tags:
``` python
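named_entities = []
temp_entity_name = ""
temp_named_entity = None
# join consecutive tagged tokens into one (entity text, tag) pair
for term, tag in ner_tagged:
    if tag:
        temp_entity_name = ' '.join([temp_entity_name, term]).strip()
        temp_named_entity = (temp_entity_name, tag)
    else:
        if temp_named_entity:
            named_entities.append(temp_named_entity)
            temp_entity_name = ""
            temp_named_entity = None
# flush an entity that runs up to the last token
if temp_named_entity:
    named_entities.append(temp_named_entity)
print(named_entities)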
```
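%% Cell type:markdown id: tags:
The loop above joins every run of consecutive tagged tokens into a single entity, so two adjacent entities of different types end up merged under the last tag. A minimal sketch of an alternative that groups on the tag itself with `itertools.groupby`, so a change of tag also starts a new entity. The helper name `group_entities` and the sample pairs are hypothetical, not part of the notebook; the input has the same `(token, tag)` shape as `ner_tagged`. Two adjacent entities with the same tag would still be merged, so iterating over spaCy's `text_nlp.ents` is likely the more reliable route when exact spans matter.
%% Cell type:code id: tags:
```python
from itertools import groupby


def group_entities(tagged_tokens):
    """Merge consecutive tokens that share a non-empty tag into (text, tag) pairs."""
    entities = []
    for tag, group in groupby(tagged_tokens, key=lambda pair: pair[1]):
        if tag:  # skip runs of untagged tokens
            entities.append((' '.join(token for token, _ in group), tag))
    return entities


# hypothetical (token, tag) pairs in the same shape as ner_tagged
sample = [('New', 'ORG'), ('York', 'ORG'), ('Times', 'ORG'), ('on', ''),
          ('Thursday', 'DATE'), ('London', 'GPE')]
print(group_entities(sample))
```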
%% Cell type:code id: tags:
``` python
# viewing the top entity types
c = Counter([item[1] for item in named_entities])
c.most_common()
```
%% Cell type:markdown id: tags:
# See page 544: how to create your own annotated corpus for named entity recognition
- This makes it possible to define other categories, such as ELEMENT (e.g., screws, nuts, box) and PROBLEM (e.g., corrosion)
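%% Cell type:markdown id: tags:
A minimal sketch of what rows of such a custom corpus could look like, assuming a hand-written term list per category. The labels `ELEMENT` and `PROBLEM`, the term lists, the sample sentence, and the helper `annotate` are hypothetical illustrations, not part of the notebook or of spaCy. The output is one token per row, tagged with the common `B-`/`I-`/`O` scheme. A dictionary lookup like this is only a bootstrap for manual annotation, not a substitute for it.
%% Cell type:code id: tags:
```python
# Hypothetical categories and terms; a real corpus would be reviewed by hand.
GAZETTEER = {
    'ELEMENT': [['parafusos'], ['porcas'], ['tanque', 'de', 'lastro']],
    'PROBLEM': [['corrosão']],
}


def annotate(tokens):
    """Return (token, tag) rows, tagging gazetteer matches with B-/I- prefixes."""
    tags = ['O'] * len(tokens)
    lowered = [t.lower() for t in tokens]
    for label, terms in GAZETTEER.items():
        for term in terms:
            n = len(term)
            for i in range(len(tokens) - n + 1):
                # only tag spans whose tokens are all still untagged
                if lowered[i:i + n] == term and all(t == 'O' for t in tags[i:i + n]):
                    tags[i] = 'B-' + label
                    for j in range(i + 1, i + n):
                        tags[j] = 'I-' + label
    return list(zip(tokens, tags))


rows = annotate('Corrosão nos parafusos do tanque de lastro'.split())
for word, tag in rows:
    print(word, tag)
```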
%% Cell type:code id: tags:
``` python
df = pd.read_csv('ner_dataset.csv', encoding='ISO-8859-1')
# forward-fill missing values; fillna(method='ffill') is deprecated in recent pandas
df = df.ffill()
df.info()
```
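%% Cell type:markdown id: tags:
A forward fill is useful when an identifier column is populated only on the first row of each group. A minimal sketch on a tiny hand-made frame: the rows are invented for illustration and are not taken from `ner_dataset.csv`, and only the column names follow the ones this notebook uses. It forward-fills the sentence identifier and then regroups the tokens into per-sentence lists of `(word, tag)` pairs.
%% Cell type:code id: tags:
```python
import pandas as pd

# invented rows; the sentence id appears only on the first token of each sentence
toy = pd.DataFrame({
    'Sentence #': ['Sentence: 1', None, None, 'Sentence: 2', None],
    'Word': ['Brazil', 'joined', 'today', 'London', 'waits'],
    'Tag': ['B-geo', 'O', 'B-tim', 'B-geo', 'O'],
})

# propagate each sentence id down to the rows that follow it
toy['Sentence #'] = toy['Sentence #'].ffill()

# collect (word, tag) pairs per sentence, keeping the original sentence order
sentences = [
    list(zip(group['Word'], group['Tag']))
    for _, group in toy.groupby('Sentence #', sort=False)
]
print(sentences)
```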
%% Cell type:code id: tags:
``` python
df.T
```
%% Cell type:code id: tags:
``` python
df['Sentence #'].nunique(), df.Word.nunique(), df.POS.nunique(), df.Tag.nunique()
```
%% Cell type:code id: tags:
``` python
df.Tag.value_counts()
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:code id: tags:
``` python
```