github.com/stanfordnlp/stanza.git
author    SecroLoL <alexshanvi@gmail.com> 2022-06-28 21:37:11 +0300
committer John Bauer <horatio@gmail.com> 2022-07-16 21:10:17 +0300
commit    20714137d81e5e63d2bcee420b22c4fd2a871306
tree      83620779d1f64e60f42e42528696bf277b033186 /demo
parent    dc5095290a4915bf7bb5afa60d5ab606fec27c30
Document visualization with spaCy. Processes completed docs or raw strings.

Includes right-to-left support (in the NER visualization in particular, tags are flipped, for example).
Added more documentation for usage, including the necessary spaCy installations.
Includes Jupyter examples for visualization; spaCy's displacy.render() works well here.
Added new Jupyter examples with support for new functions that visualize several strings with the same language pipeline.
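
The commit message mentions rendering dependency parses through spaCy and flipping the layout for right-to-left languages. The block below is not the code added by this commit; it is a minimal sketch of one way such a conversion could look, assuming tokens arrive as plain tuples. The function name `to_displacy_manual`, the `rtl` flag, and the tuple layout are hypothetical. The output follows the dictionary layout that displaCy accepts in manual mode (`words` and `arcs`).

```python
# Hypothetical helper, not part of Stanza: builds displaCy "manual" input
# for a dependency parse from simple token tuples.

def to_displacy_manual(tokens, rtl=False):
    """Return a dict with "words" and "arcs" for displaCy's manual mode.

    tokens: list of (id, text, upos, head, deprel) with 1-based ids;
            head == 0 marks the root.
    rtl:    assumed flag; when True the display order is reversed so the
            first token of the sentence ends up on the right.
    """
    n = len(tokens)

    def pos(token_id):
        # Map a 1-based token id to a 0-based display position.
        return n - token_id if rtl else token_id - 1

    ordered = sorted(tokens, key=lambda t: pos(t[0]))
    words = [{"text": text, "tag": upos} for _, text, upos, _, _ in ordered]

    arcs = []
    for token_id, _, _, head, deprel in tokens:
        if head == 0:
            continue  # the root has no incoming arc
        child, parent = pos(token_id), pos(head)
        arcs.append({
            "start": min(child, parent),
            "end": max(child, parent),
            "label": deprel,
            # "left" when the dependent is displayed before its head
            "dir": "left" if child < parent else "right",
        })
    return {"words": words, "arcs": arcs}


# First sentence of demo/en_test.conllu.txt as (id, text, upos, head, deprel).
SAMPLE_TOKENS = [
    (1, "What", "PRON", 0, "root"),
    (2, "if", "SCONJ", 4, "mark"),
    (3, "Google", "PROPN", 4, "nsubj"),
    (4, "Morphed", "VERB", 1, "advcl"),
    (5, "Into", "ADP", 6, "case"),
    (6, "GoogleOS", "PROPN", 4, "obl"),
    (7, "?", "PUNCT", 4, "punct"),
]

parsed = to_displacy_manual(SAMPLE_TOKENS)
# With spaCy installed, the result could be rendered in a notebook with:
#   from spacy import displacy
#   displacy.render(parsed, style="dep", manual=True)
```

Keeping the conversion separate from rendering means the same dictionary can be inspected or tested without spaCy installed; only the final `displacy.render` call needs it.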
Diffstat (limited to 'demo')
-rw-r--r-- demo/CONLL_Dependency_Visualizer_Example.ipynb 70
-rw-r--r-- demo/Dependency_Visualization_Testing.ipynb 78
-rw-r--r-- demo/NER_Visualization.ipynb 88
-rw-r--r-- demo/arabic_test.conllu.txt 127
-rw-r--r-- demo/en_test.conllu.txt 79
-rw-r--r-- demo/japanese_test.conllu.txt 82
6 files changed, 524 insertions, 0 deletions
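
Three of the added files are CoNLL-U test inputs. As a rough illustration of what a reader of those files has to handle, here is a small sketch that pulls the ID, FORM, UPOS, HEAD and DEPREL columns out of CoNLL-U text. It is not Stanza's loader: `read_conllu` is a hypothetical name, and it assumes the standard ten tab-separated columns (this page shows the tabs as spaces). Comment lines are skipped, as are multi-word token ranges such as `16-18` in the Arabic file.

```python
# Hypothetical minimal CoNLL-U reader, for illustration only.

def read_conllu(text):
    """Parse CoNLL-U text into a list of sentences.

    Each sentence is a list of (id, form, upos, head, deprel) tuples.
    """
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            continue  # metadata such as "# sent_id = ..." or "# text = ..."
        cols = line.split("\t")
        if len(cols) != 10:
            raise ValueError(f"expected 10 columns, got {len(cols)}: {line!r}")
        if not cols[0].isdigit():
            continue  # multi-word range ("16-18") or empty node ("8.1")
        current.append((int(cols[0]), cols[1], cols[3], int(cols[6]), cols[7]))
    if current:
        sentences.append(current)  # the file may not end with a blank line
    return sentences


# First sentence of demo/en_test.conllu.txt, rebuilt with explicit tabs.
SAMPLE = "\n".join([
    "# text = What if Google Morphed Into GoogleOS?",
    "\t".join(["1", "What", "what", "PRON", "WP", "PronType=Int", "0", "root", "0:root", "_"]),
    "\t".join(["2", "if", "if", "SCONJ", "IN", "_", "4", "mark", "4:mark", "_"]),
    "\t".join(["3", "Google", "Google", "PROPN", "NNP", "Number=Sing", "4", "nsubj", "4:nsubj", "_"]),
    "\t".join(["4", "Morphed", "morph", "VERB", "VBD", "Tense=Past", "1", "advcl", "1:advcl:if", "_"]),
])

sentences = read_conllu(SAMPLE)
```

The final `if current` check matters for the Arabic and English test files in this commit, which both end without a trailing newline.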
diff --git a/demo/CONLL_Dependency_Visualizer_Example.ipynb b/demo/CONLL_Dependency_Visualizer_Example.ipynb
new file mode 100644
index 00000000..9bc08c7c
--- /dev/null
+++ b/demo/CONLL_Dependency_Visualizer_Example.ipynb
@@ -0,0 +1,70 @@
+{
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "c0fd86c8",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from stanza.utils.visualization.conll_deprel_visualization import conll_to_visual\n",
+ "\n",
+ "# load necessary conllu files - expected to be in the demo directory along with the notebook\n",
+ "en_file = \"en_test.conllu.txt\"\n",
+ "\n",
+ "# testing left to right languages\n",
+ "conll_to_visual(en_file, \"en\", sent_count=2)\n",
+ "conll_to_visual(en_file, \"en\", sent_count=10)\n",
+ "#conll_to_visual(en_file, \"en\", display_all=True)\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "fc4b3f9b",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from stanza.utils.visualization.conll_deprel_visualization import conll_to_visual\n",
+ "\n",
+ "jp_file = \"japanese_test.conllu.txt\"\n",
+ "conll_to_visual(jp_file, \"ja\")\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "6852b8e8",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from stanza.utils.visualization.conll_deprel_visualization import conll_to_visual\n",
+ "\n",
+ "# testing right to left languages\n",
+ "ar_file = \"arabic_test.conllu.txt\"\n",
+ "conll_to_visual(ar_file, \"ar\")"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.8.3"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/demo/Dependency_Visualization_Testing.ipynb b/demo/Dependency_Visualization_Testing.ipynb
new file mode 100644
index 00000000..0945233d
--- /dev/null
+++ b/demo/Dependency_Visualization_Testing.ipynb
@@ -0,0 +1,78 @@
+{
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "64b2a9e0",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from stanza.utils.visualization.dependency_visualization import visualize_strings\n",
+ "\n",
+ "ar_strings = ['برلين ترفض حصول شركة اميركية على رخصة تصنيع دبابة \"ليوبارد\" الالمانية', \"هل بإمكاني مساعدتك؟\", \n",
+ " \"أراك في مابعد\", \"لحظة من فضلك\"]\n",
+ "# Testing with right to left language\n",
+ "visualize_strings(ar_strings, \"ar\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "35ef521b",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from stanza.utils.visualization.dependency_visualization import visualize_strings\n",
+ "\n",
+ "en_strings = [\"This is a sentence.\", \n",
+ " \"He is wearing a red shirt\",\n",
+ " \"Barack Obama was born in Hawaii. He was elected President of the United States in 2008.\"]\n",
+ "# Testing with left to right languages\n",
+ "visualize_strings(en_strings, \"en\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "f3cf10ba",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from stanza.utils.visualization.dependency_visualization import visualize_strings\n",
+ "\n",
+ "zh_strings = [\"中国是一个很有意思的国家。\"]\n",
+ "# Testing with left to right language\n",
+ "visualize_strings(zh_strings, \"zh\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "d2b9b574",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.8.3"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/demo/NER_Visualization.ipynb b/demo/NER_Visualization.ipynb
new file mode 100644
index 00000000..26630a05
--- /dev/null
+++ b/demo/NER_Visualization.ipynb
@@ -0,0 +1,88 @@
+{
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "abf300bb",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from stanza.utils.visualization.ner_visualization import visualize_strings\n",
+ "\n",
+ "en_strings = ['''Samuel Jackson, a Christian man from Utah, went to the JFK Airport for a flight to New York.\n",
+ " He was thinking of attending the US Open, his favorite tennis tournament besides Wimbledon.\n",
+ " That would be a dream trip, certainly not possible since it is $5000 attendance and 5000 miles away.\n",
+ " On the way there, he watched the Super Bowl for 2 hours and read War and Peace by Tolstoy for 1 hour.\n",
+ " In New York, he crossed the Brooklyn Bridge and listened to the 5th symphony of Beethoven as well as\n",
+ " \"All I want for Christmas is You\" by Mariah Carey.''', \n",
+ " \"Barack Obama was born in Hawaii. He was elected President of the United States in 2008\"]\n",
+ " \n",
+ "visualize_strings(en_strings, \"en\")\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "5670921a",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from stanza.utils.visualization.ner_visualization import visualize_strings\n",
+ "\n",
+ "zh_strings = ['''来自犹他州的基督徒塞缪尔杰克逊前往肯尼迪机场搭乘航班飞往纽约。\n",
+ " 他正在考虑参加美国公开赛,这是除了温布尔登之外他最喜欢的网球赛事。\n",
+ " 那将是一次梦想之旅,当然不可能,因为它的出勤费为 5000 美元,距离 5000 英里。\n",
+ " 在去的路上,他看了 2 个小时的超级碗比赛,看了 1 个小时的托尔斯泰的《战争与碎片》。\n",
+ " 在纽约,他穿过布鲁克林大桥,聆听了贝多芬的第五交响曲以及 玛丽亚凯莉的“圣诞节我想要的就是你”。''',\n",
+ " \"我觉得罗家费德勒住在加州, 在美国里面。\"]\n",
+ "visualize_strings(zh_strings, \"zh\", colors={\"PERSON\": \"yellow\", \"DATE\": \"red\", \"GPE\": \"blue\"})\n",
+ "visualize_strings(zh_strings, \"zh\", select=['PERSON', 'DATE'])"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "b8d96072",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from stanza.utils.visualization.ner_visualization import visualize_strings\n",
+ "\n",
+ "ar_strings = [\".أعيش في سان فرانسيسكو ، كاليفورنيا. اسمي أليكس وأنا ألتحق بجامعة ستانفورد. أنا أدرس علوم الكمبيوتر وأستاذي هو كريس مانينغ\"\n",
+ " , \"اسمي أليكس ، أنا من الولايات المتحدة.\", \n",
+ " '''صامويل جاكسون ، رجل مسيحي من ولاية يوتا ، ذهب إلى مطار جون كنيدي في رحلة إلى نيويورك. كان يفكر في حضور بطولة الولايات المتحدة المفتوحة للتنس ، بطولة التنس المفضلة لديه إلى جانب بطولة ويمبلدون. ستكون هذه رحلة الأحلام ، وبالتأكيد ليست ممكنة لأنها تبلغ 5000 دولار للحضور و 5000 ميل. في الطريق إلى هناك ، شاهد Super Bowl لمدة ساعتين وقرأ War and Piece by Tolstoy لمدة ساعة واحدة. في نيويورك ، عبر جسر بروكلين واستمع إلى السيمفونية الخامسة لبيتهوفن وكذلك \"كل ما أريده في عيد الميلاد هو أنت\" لماريا كاري.''']\n",
+ "\n",
+ "visualize_strings(ar_strings, \"ar\", colors={\"PER\": \"pink\", \"LOC\": \"linear-gradient(90deg, #aa9cfc, #fc9ce7)\", \"ORG\": \"yellow\"})"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "22489b27",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.8.3"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/demo/arabic_test.conllu.txt b/demo/arabic_test.conllu.txt
new file mode 100644
index 00000000..f422715e
--- /dev/null
+++ b/demo/arabic_test.conllu.txt
@@ -0,0 +1,127 @@
+# newdoc id = assabah.20041005.0017
+# newpar id = assabah.20041005.0017:p1
+# sent_id = assabah.20041005.0017:p1u1
+# text = سوريا: تعديل وزاري واسع يشمل 8 حقائب
+# orig_file_sentence ASB_ARB_20041005.0017#1
+1 سوريا سُورِيَا X X--------- Foreign=Yes 0 root 0:root SpaceAfter=No|Vform=سُورِيَا|Gloss=Syria|Root=sUr|Translit=sūriyā|LTranslit=sūriyā
+2 : : PUNCT G--------- _ 1 punct 1:punct Vform=:|Translit=:
+3 تعديل تَعدِيل NOUN N------S1I Case=Nom|Definite=Ind|Number=Sing 6 nsubj 6:nsubj Vform=تَعدِيلٌ|Gloss=adjustment,change,modification,amendment|Root=`_d_l|Translit=taʿdīlun|LTranslit=taʿdīl
+4 وزاري وِزَارِيّ ADJ A-----MS1I Case=Nom|Definite=Ind|Gender=Masc|Number=Sing 3 amod 3:amod Vform=وِزَارِيٌّ|Gloss=ministry,ministerial|Root=w_z_r|Translit=wizārīyun|LTranslit=wizārīy
+5 واسع وَاسِع ADJ A-----MS1I Case=Nom|Definite=Ind|Gender=Masc|Number=Sing 3 amod 3:amod Vform=وَاسِعٌ|Gloss=wide,extensive,broad|Root=w_s_`|Translit=wāsiʿun|LTranslit=wāsiʿ
+6 يشمل شَمِل VERB VIIA-3MS-- Aspect=Imp|Gender=Masc|Mood=Ind|Number=Sing|Person=3|VerbForm=Fin|Voice=Act 1 parataxis 1:parataxis Vform=يَشمَلُ|Gloss=comprise,include,contain|Root=^s_m_l|Translit=yašmalu|LTranslit=šamil
+7 8 8 NUM Q--------- NumForm=Digit 6 obj 6:obj Vform=٨|Translit=8
+8 حقائب حَقِيبَة NOUN N------P2I Case=Gen|Definite=Ind|Number=Plur 7 nmod 7:nmod:gen Vform=حَقَائِبَ|Gloss=briefcase,suitcase,portfolio,luggage|Root=.h_q_b|Translit=ḥaqāʾiba|LTranslit=ḥaqībat
+
+# newpar id = assabah.20041005.0017:p2
+# sent_id = assabah.20041005.0017:p2u1
+# text = دمشق (وكالات الانباء) - اجرى الرئيس السوري بشار الاسد تعديلا حكومياً واسعا تم بموجبه إقالة وزيري الداخلية والاعلام عن منصبيها في حين ظل محمد ناجي العطري رئيساً للحكومة.
+# orig_file_sentence ASB_ARB_20041005.0017#2
+1 دمشق دمشق X U--------- _ 0 root 0:root Vform=دمشق|Root=OOV|Translit=dmšq
+2 ( ( PUNCT G--------- _ 3 punct 3:punct SpaceAfter=No|Vform=(|Translit=(
+3 وكالات وِكَالَة NOUN N------P1R Case=Nom|Definite=Cons|Number=Plur 1 dep 1:dep Vform=وِكَالَاتُ|Gloss=agency|Root=w_k_l|Translit=wikālātu|LTranslit=wikālat
+4 الانباء نَبَأ NOUN N------P2D Case=Gen|Definite=Def|Number=Plur 3 nmod 3:nmod:gen SpaceAfter=No|Vform=اَلأَنبَاءِ|Gloss=news_item,report|Root=n_b_'|Translit=al-ʾanbāʾi|LTranslit=nabaʾ
+5 ) ) PUNCT G--------- _ 3 punct 3:punct Vform=)|Translit=)
+6 - - PUNCT G--------- _ 1 punct 1:punct Vform=-|Translit=-
+7 اجرى أَجرَى VERB VP-A-3MS-- Aspect=Perf|Gender=Masc|Number=Sing|Person=3|Voice=Act 1 advcl 1:advcl:فِي_حِينَ Vform=أَجرَى|Gloss=conduct,carry_out,perform|Root=^g_r_y|Translit=ʾaǧrā|LTranslit=ʾaǧrā
+8 الرئيس رَئِيس NOUN N------S1D Case=Nom|Definite=Def|Number=Sing 7 nsubj 7:nsubj Vform=اَلرَّئِيسُ|Gloss=president,head,chairman|Root=r_'_s|Translit=ar-raʾīsu|LTranslit=raʾīs
+9 السوري سُورِيّ ADJ A-----MS1D Case=Nom|Definite=Def|Gender=Masc|Number=Sing 8 amod 8:amod Vform=اَلسُّورِيُّ|Gloss=Syrian|Root=sUr|Translit=as-sūrīyu|LTranslit=sūrīy
+10 بشار بشار X U--------- _ 11 nmod 11:nmod Vform=بشار|Root=OOV|Translit=bšār
+11 الاسد الاسد X U--------- _ 8 nmod 8:nmod Vform=الاسد|Root=OOV|Translit=ālāsd
+12 تعديلا تَعدِيل NOUN N------S4I Case=Acc|Definite=Ind|Number=Sing 7 obj 7:obj Vform=تَعدِيلًا|Gloss=adjustment,change,modification,amendment|Root=`_d_l|Translit=taʿdīlan|LTranslit=taʿdīl
+13 حكومياً حُكُومِيّ ADJ A-----MS4I Case=Acc|Definite=Ind|Gender=Masc|Number=Sing 12 amod 12:amod Vform=حُكُومِيًّا|Gloss=governmental,state,official|Root=.h_k_m|Translit=ḥukūmīyan|LTranslit=ḥukūmīy
+14 واسعا وَاسِع ADJ A-----MS4I Case=Acc|Definite=Ind|Gender=Masc|Number=Sing 12 amod 12:amod Vform=وَاسِعًا|Gloss=wide,extensive,broad|Root=w_s_`|Translit=wāsiʿan|LTranslit=wāsiʿ
+15 تم تَمّ VERB VP-A-3MS-- Aspect=Perf|Gender=Masc|Number=Sing|Person=3|Voice=Act 12 acl 12:acl Vform=تَمَّ|Gloss=conclude,take_place|Root=t_m_m|Translit=tamma|LTranslit=tamm
+16-18 بموجبه _ _ _ _ _ _ _ _
+16 ب بِ ADP P--------- AdpType=Prep 18 case 18:case Vform=بِ|Gloss=by,with|Root=bi|Translit=bi|LTranslit=bi
+17 موجب مُوجِب NOUN N------S2R Case=Gen|Definite=Cons|Number=Sing 16 fixed 16:fixed Vform=مُوجِبِ|Gloss=reason,motive|Root=w_^g_b|Translit=mūǧibi|LTranslit=mūǧib
+18 ه هُوَ PRON SP---3MS2- Case=Gen|Gender=Masc|Number=Sing|Person=3|PronType=Prs 15 nmod 15:nmod:بِ_مُوجِب:gen Vform=هِ|Gloss=he,she,it|Translit=hi|LTranslit=huwa
+19 إقالة إِقَالَة NOUN N------S1R Case=Nom|Definite=Cons|Number=Sing 15 nsubj 15:nsubj Vform=إِقَالَةُ|Gloss=dismissal,discharge|Root=q_y_l|Translit=ʾiqālatu|LTranslit=ʾiqālat
+20 وزيري وَزِير NOUN N------D2R Case=Gen|Definite=Cons|Number=Dual 19 nmod 19:nmod:gen Vform=وَزِيرَي|Gloss=minister|Root=w_z_r|Translit=wazīray|LTranslit=wazīr
+21 الداخلية دَاخِلِيّ ADJ A-----FS2D Case=Gen|Definite=Def|Gender=Fem|Number=Sing 20 amod 20:amod Vform=اَلدَّاخِلِيَّةِ|Gloss=internal,domestic,interior,of_state|Root=d__h_l|Translit=ad-dāḫilīyati|LTranslit=dāḫilīy
+22-23 والاعلام _ _ _ _ _ _ _ _
+22 و وَ CCONJ C--------- _ 23 cc 23:cc Vform=وَ|Gloss=and|Root=wa|Translit=wa|LTranslit=wa
+23 الإعلام إِعلَام NOUN N------S2D Case=Gen|Definite=Def|Number=Sing 21 conj 20:amod|21:conj Vform=اَلإِعلَامِ|Gloss=information,media|Root=`_l_m|Translit=al-ʾiʿlāmi|LTranslit=ʾiʿlām
+24 عن عَن ADP P--------- AdpType=Prep 25 case 25:case Vform=عَن|Gloss=about,from|Root=`an|Translit=ʿan|LTranslit=ʿan
+25-26 منصبيها _ _ _ _ _ _ _ _
+25 منصبي مَنصِب NOUN N------D2R Case=Gen|Definite=Cons|Number=Dual 19 nmod 19:nmod:عَن:gen Vform=مَنصِبَي|Gloss=post,position,office|Root=n_.s_b|Translit=manṣibay|LTranslit=manṣib
+26 ها هُوَ PRON SP---3FS2- Case=Gen|Gender=Fem|Number=Sing|Person=3|PronType=Prs 25 nmod 25:nmod:gen Vform=هَا|Gloss=he,she,it|Translit=hā|LTranslit=huwa
+27 في فِي ADP P--------- AdpType=Prep 7 mark 7:mark Vform=فِي|Gloss=in|Root=fI|Translit=fī|LTranslit=fī
+28 حين حِينَ ADP PI------2- AdpType=Prep|Case=Gen 7 mark 7:mark Vform=حِينِ|Gloss=when|Root=.h_y_n|Translit=ḥīni|LTranslit=ḥīna
+29 ظل ظَلّ VERB VP-A-3MS-- Aspect=Perf|Gender=Masc|Number=Sing|Person=3|Voice=Act 7 parataxis 7:parataxis Vform=ظَلَّ|Gloss=remain,continue|Root=.z_l_l|Translit=ẓalla|LTranslit=ẓall
+30 محمد محمد X U--------- _ 32 nmod 32:nmod Vform=محمد|Root=OOV|Translit=mḥmd
+31 ناجي ناجي X U--------- _ 32 nmod 32:nmod Vform=ناجي|Root=OOV|Translit=nāǧy
+32 العطري العطري X U--------- _ 29 nsubj 29:nsubj Vform=العطري|Root=OOV|Translit=ālʿṭry
+33 رئيساً رَئِيس NOUN N------S4I Case=Acc|Definite=Ind|Number=Sing 29 xcomp 29:xcomp Vform=رَئِيسًا|Gloss=president,head,chairman|Root=r_'_s|Translit=raʾīsan|LTranslit=raʾīs
+34-35 للحكومة _ _ _ _ _ _ _ SpaceAfter=No
+34 ل لِ ADP P--------- AdpType=Prep 35 case 35:case Vform=لِ|Gloss=for,to|Root=l|Translit=li|LTranslit=li
+35 الحكومة حُكُومَة NOUN N------S2D Case=Gen|Definite=Def|Number=Sing 33 nmod 33:nmod:لِ:gen Vform=اَلحُكُومَةِ|Gloss=government,administration|Root=.h_k_m|Translit=al-ḥukūmati|LTranslit=ḥukūmat
+36 . . PUNCT G--------- _ 1 punct 1:punct Vform=.|Translit=.
+
+# newpar id = assabah.20041005.0017:p3
+# sent_id = assabah.20041005.0017:p3u1
+# text = واضافت المصادر ان مهدي دخل الله رئيس تحرير صحيفة الحزب الحاكم والليبرالي التوجهات تسلم منصب وزير الاعلام خلفا لاحمد الحسن فيما تسلم اللواء غازي كنعان رئيس شعبة الامن السياسي منصب وزير الداخلية.
+# orig_file_sentence ASB_ARB_20041005.0017#3
+1-2 واضافت _ _ _ _ _ _ _ _
+1 و وَ CCONJ C--------- _ 0 root 0:root Vform=وَ|Gloss=and|Root=wa|Translit=wa|LTranslit=wa
+2 أضافت أَضَاف VERB VP-A-3FS-- Aspect=Perf|Gender=Fem|Number=Sing|Person=3|Voice=Act 1 parataxis 1:parataxis Vform=أَضَافَت|Gloss=add,attach,receive_as_guest|Root=.d_y_f|Translit=ʾaḍāfat|LTranslit=ʾaḍāf
+3 المصادر مَصدَر NOUN N------P1D Case=Nom|Definite=Def|Number=Plur 2 nsubj 2:nsubj Vform=اَلمَصَادِرُ|Gloss=source|Root=.s_d_r|Translit=al-maṣādiru|LTranslit=maṣdar
+4 ان أَنَّ SCONJ C--------- _ 16 mark 16:mark Vform=أَنَّ|Gloss=that|Root='_n|Translit=ʾanna|LTranslit=ʾanna
+5 مهدي مهدي X U--------- _ 6 nmod 6:nmod Vform=مهدي|Root=OOV|Translit=mhdy
+6 دخل دخل X U--------- _ 16 nsubj 16:nsubj Vform=دخل|Root=OOV|Translit=dḫl
+7 الله الله X U--------- _ 6 nmod 6:nmod Vform=الله|Root=OOV|Translit=āllh
+8 رئيس رَئِيس NOUN N------S4R Case=Acc|Definite=Cons|Number=Sing 6 nmod 6:nmod:acc Vform=رَئِيسَ|Gloss=president,head,chairman|Root=r_'_s|Translit=raʾīsa|LTranslit=raʾīs
+9 تحرير تَحرِير NOUN N------S2R Case=Gen|Definite=Cons|Number=Sing 8 nmod 8:nmod:gen Vform=تَحرِيرِ|Gloss=liberation,liberating,editorship,editing|Root=.h_r_r|Translit=taḥrīri|LTranslit=taḥrīr
+10 صحيفة صَحِيفَة NOUN N------S2R Case=Gen|Definite=Cons|Number=Sing 9 nmod 9:nmod:gen Vform=صَحِيفَةِ|Gloss=newspaper,sheet,leaf|Root=.s_.h_f|Translit=ṣaḥīfati|LTranslit=ṣaḥīfat
+11 الحزب حِزب NOUN N------S2D Case=Gen|Definite=Def|Number=Sing 10 nmod 10:nmod:gen Vform=اَلحِزبِ|Gloss=party,band|Root=.h_z_b|Translit=al-ḥizbi|LTranslit=ḥizb
+12 الحاكم حَاكِم NOUN N------S2D Case=Gen|Definite=Def|Number=Sing 11 nmod 11:nmod:gen Vform=اَلحَاكِمِ|Gloss=ruler,governor|Root=.h_k_m|Translit=al-ḥākimi|LTranslit=ḥākim
+13-14 والليبرالي _ _ _ _ _ _ _ _
+13 و وَ CCONJ C--------- _ 6 cc 6:cc Vform=وَ|Gloss=and|Root=wa|Translit=wa|LTranslit=wa
+14 الليبرالي لِيبِرَالِيّ ADJ A-----MS4D Case=Acc|Definite=Def|Gender=Masc|Number=Sing 6 amod 6:amod Vform=اَللِّيبِرَالِيَّ|Gloss=liberal|Root=lIbirAl|Translit=al-lībirālīya|LTranslit=lībirālīy
+15 التوجهات تَوَجُّه NOUN N------P2D Case=Gen|Definite=Def|Number=Plur 14 nmod 14:nmod:gen Vform=اَلتَّوَجُّهَاتِ|Gloss=attitude,approach|Root=w_^g_h|Translit=at-tawaǧǧuhāti|LTranslit=tawaǧǧuh
+16 تسلم تَسَلَّم VERB VP-A-3MS-- Aspect=Perf|Gender=Masc|Number=Sing|Person=3|Voice=Act 2 ccomp 2:ccomp Vform=تَسَلَّمَ|Gloss=receive,assume|Root=s_l_m|Translit=tasallama|LTranslit=tasallam
+17 منصب مَنصِب NOUN N------S4R Case=Acc|Definite=Cons|Number=Sing 16 obj 16:obj Vform=مَنصِبَ|Gloss=post,position,office|Root=n_.s_b|Translit=manṣiba|LTranslit=manṣib
+18 وزير وَزِير NOUN N------S2R Case=Gen|Definite=Cons|Number=Sing 17 nmod 17:nmod:gen Vform=وَزِيرِ|Gloss=minister|Root=w_z_r|Translit=wazīri|LTranslit=wazīr
+19 الاعلام عَلَم NOUN N------P2D Case=Gen|Definite=Def|Number=Plur 18 nmod 18:nmod:gen Vform=اَلأَعلَامِ|Gloss=flag,banner,badge|Root=`_l_m|Translit=al-ʾaʿlāmi|LTranslit=ʿalam
+20 خلفا خَلَف NOUN N------S4I Case=Acc|Definite=Ind|Number=Sing 16 obl 16:obl:acc Vform=خَلَفًا|Gloss=substitute,scion|Root=_h_l_f|Translit=ḫalafan|LTranslit=ḫalaf
+21-22 لاحمد _ _ _ _ _ _ _ _
+21 ل لِ ADP P--------- AdpType=Prep 23 case 23:case Vform=لِ|Gloss=for,to|Root=l|Translit=li|LTranslit=li
+22 أحمد أَحمَد NOUN N------S2I Case=Gen|Definite=Ind|Number=Sing 23 nmod 23:nmod:gen Vform=أَحمَدَ|Gloss=Ahmad|Root=.h_m_d|Translit=ʾaḥmada|LTranslit=ʾaḥmad
+23 الحسن الحسن X U--------- _ 20 nmod 20:nmod:لِ Vform=الحسن|Root=OOV|Translit=ālḥsn
+24 فيما فِيمَا CCONJ C--------- _ 25 cc 25:cc Vform=فِيمَا|Gloss=while,during_which|Root=fI|Translit=fīmā|LTranslit=fīmā
+25 تسلم تَسَلَّم VERB VP-A-3MS-- Aspect=Perf|Gender=Masc|Number=Sing|Person=3|Voice=Act 16 conj 2:ccomp|16:conj Vform=تَسَلَّمَ|Gloss=receive,assume|Root=s_l_m|Translit=tasallama|LTranslit=tasallam
+26 اللواء لِوَاء NOUN N------S1D Case=Nom|Definite=Def|Number=Sing 25 nsubj 25:nsubj Vform=اَللِّوَاءُ|Gloss=banner,flag|Root=l_w_y|Translit=al-liwāʾu|LTranslit=liwāʾ
+27 غازي غازي X U--------- _ 28 nmod 28:nmod Vform=غازي|Root=OOV|Translit=ġāzy
+28 كنعان كنعان X U--------- _ 26 nmod 26:nmod Vform=كنعان|Root=OOV|Translit=knʿān
+29 رئيس رَئِيس NOUN N------S1R Case=Nom|Definite=Cons|Number=Sing 26 nmod 26:nmod:nom Vform=رَئِيسُ|Gloss=president,head,chairman|Root=r_'_s|Translit=raʾīsu|LTranslit=raʾīs
+30 شعبة شُعبَة NOUN N------S2R Case=Gen|Definite=Cons|Number=Sing 29 nmod 29:nmod:gen Vform=شُعبَةِ|Gloss=branch,subdivision|Root=^s_`_b|Translit=šuʿbati|LTranslit=šuʿbat
+31 الامن أَمن NOUN N------S2D Case=Gen|Definite=Def|Number=Sing 30 nmod 30:nmod:gen Vform=اَلأَمنِ|Gloss=security,safety|Root='_m_n|Translit=al-ʾamni|LTranslit=ʾamn
+32 السياسي سِيَاسِيّ ADJ A-----MS2D Case=Gen|Definite=Def|Gender=Masc|Number=Sing 31 amod 31:amod Vform=اَلسِّيَاسِيِّ|Gloss=political|Root=s_w_s|Translit=as-siyāsīyi|LTranslit=siyāsīy
+33 منصب مَنصِب NOUN N------S4R Case=Acc|Definite=Cons|Number=Sing 25 obj 25:obj Vform=مَنصِبَ|Gloss=post,position,office|Root=n_.s_b|Translit=manṣiba|LTranslit=manṣib
+34 وزير وَزِير NOUN N------S2R Case=Gen|Definite=Cons|Number=Sing 33 nmod 33:nmod:gen Vform=وَزِيرِ|Gloss=minister|Root=w_z_r|Translit=wazīri|LTranslit=wazīr
+35 الداخلية دَاخِلِيّ ADJ A-----FS2D Case=Gen|Definite=Def|Gender=Fem|Number=Sing 34 amod 34:amod SpaceAfter=No|Vform=اَلدَّاخِلِيَّةِ|Gloss=internal,domestic,interior,of_state|Root=d__h_l|Translit=ad-dāḫilīyati|LTranslit=dāḫilīy
+36 . . PUNCT G--------- _ 1 punct 1:punct Vform=.|Translit=.
+
+# newpar id = assabah.20041005.0017:p4
+# sent_id = assabah.20041005.0017:p4u1
+# text = وذكرت وكالة الانباء السورية ان التعديل شمل ثماني حقائب بينها وزارتا الداخلية والاقتصاد.
+# orig_file_sentence ASB_ARB_20041005.0017#4
+1-2 وذكرت _ _ _ _ _ _ _ _
+1 و وَ CCONJ C--------- _ 0 root 0:root Vform=وَ|Gloss=and|Root=wa|Translit=wa|LTranslit=wa
+2 ذكرت ذَكَر VERB VP-A-3FS-- Aspect=Perf|Gender=Fem|Number=Sing|Person=3|Voice=Act 1 parataxis 1:parataxis Vform=ذَكَرَت|Gloss=mention,cite,remember|Root=_d_k_r|Translit=ḏakarat|LTranslit=ḏakar
+3 وكالة وِكَالَة NOUN N------S1R Case=Nom|Definite=Cons|Number=Sing 2 nsubj 2:nsubj Vform=وِكَالَةُ|Gloss=agency|Root=w_k_l|Translit=wikālatu|LTranslit=wikālat
+4 الانباء نَبَأ NOUN N------P2D Case=Gen|Definite=Def|Number=Plur 3 nmod 3:nmod:gen Vform=اَلأَنبَاءِ|Gloss=news_item,report|Root=n_b_'|Translit=al-ʾanbāʾi|LTranslit=nabaʾ
+5 السورية سُورِيّ ADJ A-----FS1D Case=Nom|Definite=Def|Gender=Fem|Number=Sing 3 amod 3:amod Vform=اَلسُّورِيَّةُ|Gloss=Syrian|Root=sUr|Translit=as-sūrīyatu|LTranslit=sūrīy
+6 ان أَنَّ SCONJ C--------- _ 8 mark 8:mark Vform=أَنَّ|Gloss=that|Root='_n|Translit=ʾanna|LTranslit=ʾanna
+7 التعديل تَعدِيل NOUN N------S4D Case=Acc|Definite=Def|Number=Sing 8 obl 8:obl:acc Vform=اَلتَّعدِيلَ|Gloss=adjustment,change,modification,amendment|Root=`_d_l|Translit=at-taʿdīla|LTranslit=taʿdīl
+8 شمل شَمِل VERB VP-A-3MS-- Aspect=Perf|Gender=Masc|Number=Sing|Person=3|Voice=Act 2 ccomp 2:ccomp Vform=شَمِلَ|Gloss=comprise,include,contain|Root=^s_m_l|Translit=šamila|LTranslit=šamil
+9 ثماني ثَمَانُون NUM QL------4R Case=Acc|Definite=Cons|NumForm=Word 8 obj 8:obj Vform=ثَمَانِي|Gloss=eighty|Root=_t_m_n|Translit=ṯamānī|LTranslit=ṯamānūn
+10 حقائب حَقِيبَة NOUN N------P2I Case=Gen|Definite=Ind|Number=Plur 9 nmod 9:nmod:gen Vform=حَقَائِبَ|Gloss=briefcase,suitcase,portfolio,luggage|Root=.h_q_b|Translit=ḥaqāʾiba|LTranslit=ḥaqībat
+11-12 بينها _ _ _ _ _ _ _ _
+11 بين بَينَ ADP PI------4- AdpType=Prep|Case=Acc 12 case 12:case Vform=بَينَ|Gloss=between,among|Root=b_y_n|Translit=bayna|LTranslit=bayna
+12 ها هُوَ PRON SP---3FS2- Case=Gen|Gender=Fem|Number=Sing|Person=3|PronType=Prs 10 obl 10:obl:بَينَ:gen Vform=هَا|Gloss=he,she,it|Translit=hā|LTranslit=huwa
+13 وزارتا وِزَارَة NOUN N------D1R Case=Nom|Definite=Cons|Number=Dual 12 nsubj 12:nsubj Vform=وِزَارَتَا|Gloss=ministry|Root=w_z_r|Translit=wizāratā|LTranslit=wizārat
+14 الداخلية دَاخِلِيّ ADJ A-----FS2D Case=Gen|Definite=Def|Gender=Fem|Number=Sing 13 amod 13:amod Vform=اَلدَّاخِلِيَّةِ|Gloss=internal,domestic,interior,of_state|Root=d__h_l|Translit=ad-dāḫilīyati|LTranslit=dāḫilīy
+15-16 والاقتصاد _ _ _ _ _ _ _ SpaceAfter=No
+15 و وَ CCONJ C--------- _ 16 cc 16:cc Vform=وَ|Gloss=and|Root=wa|Translit=wa|LTranslit=wa
+16 الاقتصاد اِقتِصَاد NOUN N------S2D Case=Gen|Definite=Def|Number=Sing 14 conj 13:amod|14:conj Vform=اَلِاقتِصَادِ|Gloss=economy,saving|Root=q_.s_d|Translit=al-i-ʼqtiṣādi|LTranslit=iqtiṣād
+17 . . PUNCT G--------- _ 1 punct 1:punct Vform=.|Translit=.
\ No newline at end of file
diff --git a/demo/en_test.conllu.txt b/demo/en_test.conllu.txt
new file mode 100644
index 00000000..0dfb9d91
--- /dev/null
+++ b/demo/en_test.conllu.txt
@@ -0,0 +1,79 @@
+# newdoc id = weblog-blogspot.com_zentelligence_20040423000200_ENG_20040423_000200
+# sent_id = weblog-blogspot.com_zentelligence_20040423000200_ENG_20040423_000200-0001
+# newpar id = weblog-blogspot.com_zentelligence_20040423000200_ENG_20040423_000200-p0001
+# text = What if Google Morphed Into GoogleOS?
+1 What what PRON WP PronType=Int 0 root 0:root _
+2 if if SCONJ IN _ 4 mark 4:mark _
+3 Google Google PROPN NNP Number=Sing 4 nsubj 4:nsubj _
+4 Morphed morph VERB VBD Mood=Ind|Number=Sing|Person=3|Tense=Past|VerbForm=Fin 1 advcl 1:advcl:if _
+5 Into into ADP IN _ 6 case 6:case _
+6 GoogleOS GoogleOS PROPN NNP Number=Sing 4 obl 4:obl:into SpaceAfter=No
+7 ? ? PUNCT . _ 4 punct 4:punct _
+
+# sent_id = weblog-blogspot.com_zentelligence_20040423000200_ENG_20040423_000200-0002
+# text = What if Google expanded on its search-engine (and now e-mail) wares into a full-fledged operating system?
+1 What what PRON WP PronType=Int 0 root 0:root _
+2 if if SCONJ IN _ 4 mark 4:mark _
+3 Google Google PROPN NNP Number=Sing 4 nsubj 4:nsubj _
+4 expanded expand VERB VBD Mood=Ind|Number=Sing|Person=3|Tense=Past|VerbForm=Fin 1 advcl 1:advcl:if _
+5 on on ADP IN _ 15 case 15:case _
+6 its its PRON PRP$ Gender=Neut|Number=Sing|Person=3|Poss=Yes|PronType=Prs 15 nmod:poss 15:nmod:poss _
+7 search search NOUN NN Number=Sing 9 compound 9:compound SpaceAfter=No
+8 - - PUNCT HYPH _ 9 punct 9:punct SpaceAfter=No
+9 engine engine NOUN NN Number=Sing 15 compound 15:compound _
+10 ( ( PUNCT -LRB- _ 9 punct 9:punct SpaceAfter=No
+11 and and CCONJ CC _ 13 cc 13:cc _
+12 now now ADV RB _ 13 advmod 13:advmod _
+13 e-mail e-mail NOUN NN Number=Sing 9 conj 9:conj:and|15:compound SpaceAfter=No
+14 ) ) PUNCT -RRB- _ 15 punct 15:punct _
+15 wares wares NOUN NNS Number=Plur 4 obl 4:obl:on _
+16 into into ADP IN _ 22 case 22:case _
+17 a a DET DT Definite=Ind|PronType=Art 22 det 22:det _
+18 full full ADV RB _ 20 advmod 20:advmod SpaceAfter=No
+19 - - PUNCT HYPH _ 20 punct 20:punct SpaceAfter=No
+20 fledged fledged ADJ JJ Degree=Pos 22 amod 22:amod _
+21 operating operating NOUN NN Number=Sing 22 compound 22:compound _
+22 system system NOUN NN Number=Sing 4 obl 4:obl:into SpaceAfter=No
+23 ? ? PUNCT . _ 4 punct 4:punct _
+
+# sent_id = weblog-blogspot.com_zentelligence_20040423000200_ENG_20040423_000200-0003
+# text = [via Microsoft Watch from Mary Jo Foley ]
+1 [ [ PUNCT -LRB- _ 4 punct 4:punct SpaceAfter=No
+2 via via ADP IN _ 4 case 4:case _
+3 Microsoft Microsoft PROPN NNP Number=Sing 4 compound 4:compound _
+4 Watch Watch PROPN NNP Number=Sing 0 root 0:root _
+5 from from ADP IN _ 6 case 6:case _
+6 Mary Mary PROPN NNP Number=Sing 4 nmod 4:nmod:from _
+7 Jo Jo PROPN NNP Number=Sing 6 flat 6:flat _
+8 Foley Foley PROPN NNP Number=Sing 6 flat 6:flat _
+9 ] ] PUNCT -RRB- _ 4 punct 4:punct _
+
+# newdoc id = weblog-blogspot.com_marketview_20050511222700_ENG_20050511_222700
+# sent_id = weblog-blogspot.com_marketview_20050511222700_ENG_20050511_222700-0001
+# newpar id = weblog-blogspot.com_marketview_20050511222700_ENG_20050511_222700-p0001
+# text = (And, by the way, is anybody else just a little nostalgic for the days when that was a good thing?)
+1 ( ( PUNCT -LRB- _ 14 punct 14:punct SpaceAfter=No
+2 And and CCONJ CC _ 14 cc 14:cc SpaceAfter=No
+3 , , PUNCT , _ 14 punct 14:punct _
+4 by by ADP IN _ 6 case 6:case _
+5 the the DET DT Definite=Def|PronType=Art 6 det 6:det _
+6 way way NOUN NN Number=Sing 14 obl 14:obl:by SpaceAfter=No
+7 , , PUNCT , _ 14 punct 14:punct _
+8 is be AUX VBZ Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin 14 cop 14:cop _
+9 anybody anybody PRON NN Number=Sing 14 nsubj 14:nsubj _
+10 else else ADJ JJ Degree=Pos 9 amod 9:amod _
+11 just just ADV RB _ 13 advmod 13:advmod _
+12 a a DET DT Definite=Ind|PronType=Art 13 det 13:det _
+13 little little ADJ JJ Degree=Pos 14 obl:npmod 14:obl:npmod _
+14 nostalgic nostalgic NOUN NN Number=Sing 0 root 0:root _
+15 for for ADP IN _ 17 case 17:case _
+16 the the DET DT Definite=Def|PronType=Art 17 det 17:det _
+17 days day NOUN NNS Number=Plur 14 nmod 14:nmod:for|23:obl:npmod _
+18 when when ADV WRB PronType=Rel 23 advmod 17:ref _
+19 that that PRON DT Number=Sing|PronType=Dem 23 nsubj 23:nsubj _
+20 was be AUX VBD Mood=Ind|Number=Sing|Person=3|Tense=Past|VerbForm=Fin 23 cop 23:cop _
+21 a a DET DT Definite=Ind|PronType=Art 23 det 23:det _
+22 good good ADJ JJ Degree=Pos 23 amod 23:amod _
+23 thing thing NOUN NN Number=Sing 17 acl:relcl 17:acl:relcl SpaceAfter=No
+24 ? ? PUNCT . _ 14 punct 14:punct SpaceAfter=No
+25 ) ) PUNCT -RRB- _ 14 punct 14:punct _
\ No newline at end of file
diff --git a/demo/japanese_test.conllu.txt b/demo/japanese_test.conllu.txt
new file mode 100644
index 00000000..af83e104
--- /dev/null
+++ b/demo/japanese_test.conllu.txt
@@ -0,0 +1,82 @@
+# newdoc id = test-s1
+# sent_id = test-s1
+# text = これに不快感を示す住民はいましたが,現在,表立って反対や抗議の声を挙げている住民はいないようです。
+1 これ 此れ PRON 代名詞 _ 6 obl _ BunsetuBILabel=B|BunsetuPositionType=SEM_HEAD|LUWBILabel=B|LUWPOS=代名詞|SpaceAfter=No|UnidicInfo=,此れ,これ,これ,コレ,,,コレ,コレ,此れ
+2 に に ADP 助詞-格助詞 _ 1 case _ BunsetuBILabel=I|BunsetuPositionType=SYN_HEAD|LUWBILabel=B|LUWPOS=助詞-格助詞|SpaceAfter=No|UnidicInfo=,に,に,に,ニ,,,ニ,ニ,に
+3 不快 不快 NOUN 名詞-普通名詞-形状詞可能 _ 4 compound _ BunsetuBILabel=B|BunsetuPositionType=CONT|LUWBILabel=B|LUWPOS=名詞-普通名詞-一般|SpaceAfter=No|UnidicInfo=,不快,不快,不快,フカイ,,,フカイ,フカイカン,不快感
+4 感 感 NOUN 名詞-普通名詞-一般 _ 6 obj _ BunsetuBILabel=I|BunsetuPositionType=SEM_HEAD|LUWBILabel=I|LUWPOS=名詞-普通名詞-一般|SpaceAfter=No|UnidicInfo=,感,感,感,カン,,,カン,フカイカン,不快感
+5 を を ADP 助詞-格助詞 _ 4 case _ BunsetuBILabel=I|BunsetuPositionType=SYN_HEAD|LUWBILabel=B|LUWPOS=助詞-格助詞|SpaceAfter=No|UnidicInfo=,を,を,を,オ,,,ヲ,ヲ,を
+6 示す 示す VERB 動詞-一般-五段-サ行 _ 7 acl _ BunsetuBILabel=B|BunsetuPositionType=SEM_HEAD|LUWBILabel=B|LUWPOS=動詞-一般-五段-サ行|SpaceAfter=No|UnidicInfo=,示す,示す,示す,シメス,,,シメス,シメス,示す
+7 住民 住民 NOUN 名詞-普通名詞-一般 _ 9 nsubj _ BunsetuBILabel=B|BunsetuPositionType=SEM_HEAD|LUWBILabel=B|LUWPOS=名詞-普通名詞-一般|SpaceAfter=No|UnidicInfo=,住民,住民,住民,ジューミン,,,ジュウミン,ジュウミン,住民
+8 は は ADP 助詞-係助詞 _ 7 case _ BunsetuBILabel=I|BunsetuPositionType=SYN_HEAD|LUWBILabel=B|LUWPOS=助詞-係助詞|SpaceAfter=No|UnidicInfo=,は,は,は,ワ,,,ハ,ハ,は
+9 い 居る VERB 動詞-非自立可能-上一段-ア行 _ 29 advcl _ BunsetuBILabel=B|BunsetuPositionType=SEM_HEAD|LUWBILabel=B|LUWPOS=動詞-一般-上一段-ア行|PrevUDLemma=いる|SpaceAfter=No|UnidicInfo=,居る,い,いる,イ,,,イル,イル,居る
+10 まし ます AUX 助動詞-助動詞-マス _ 9 aux _ BunsetuBILabel=I|BunsetuPositionType=SYN_HEAD|LUWBILabel=B|LUWPOS=助動詞-助動詞-マス|SpaceAfter=No|UnidicInfo=,ます,まし,ます,マシ,,,マス,マス,ます
+11 た た AUX 助動詞-助動詞-タ _ 9 aux _ BunsetuBILabel=I|BunsetuPositionType=FUNC|LUWBILabel=B|LUWPOS=助動詞-助動詞-タ|SpaceAfter=No|UnidicInfo=,た,た,た,タ,,,タ,タ,た
+12 が が SCONJ 助詞-接続助詞 _ 9 mark _ BunsetuBILabel=I|BunsetuPositionType=FUNC|LUWBILabel=B|LUWPOS=助詞-接続助詞|SpaceAfter=No|UnidicInfo=,が,が,が,ガ,,,ガ,ガ,が
+13 , , PUNCT 補助記号-読点 _ 9 punct _ BunsetuBILabel=I|BunsetuPositionType=CONT|LUWBILabel=B|LUWPOS=補助記号-読点|SpaceAfter=No|UnidicInfo=,,,,,,,,,,,
+14 現在 現在 ADV 名詞-普通名詞-副詞可能 _ 16 advmod _ BunsetuBILabel=B|BunsetuPositionType=SEM_HEAD|LUWBILabel=B|LUWPOS=副詞|SpaceAfter=No|UnidicInfo=,現在,現在,現在,ゲンザイ,,,ゲンザイ,ゲンザイ,現在
+15 , , PUNCT 補助記号-読点 _ 14 punct _ BunsetuBILabel=I|BunsetuPositionType=CONT|LUWBILabel=B|LUWPOS=補助記号-読点|SpaceAfter=No|UnidicInfo=,,,,,,,,,,,
+16 表立っ 表立つ VERB 動詞-一般-五段-タ行 _ 24 advcl _ BunsetuBILabel=B|BunsetuPositionType=SEM_HEAD|LUWBILabel=B|LUWPOS=動詞-一般-五段-タ行|SpaceAfter=No|UnidicInfo=,表立つ,表立っ,表立つ,オモテダッ,,,オモテダツ,オモテダツ,表立つ
+17 て て SCONJ 助詞-接続助詞 _ 16 mark _ BunsetuBILabel=I|BunsetuPositionType=SYN_HEAD|LUWBILabel=B|LUWPOS=助詞-接続助詞|SpaceAfter=No|UnidicInfo=,て,て,て,テ,,,テ,テ,て
+18 反対 反対 NOUN 名詞-普通名詞-サ変形状詞可能 _ 20 nmod _ BunsetuBILabel=B|BunsetuPositionType=SEM_HEAD|LUWBILabel=B|LUWPOS=名詞-普通名詞-一般|SpaceAfter=No|UnidicInfo=,反対,反対,反対,ハンタイ,,,ハンタイ,ハンタイ,反対
+19 や や ADP 助詞-副助詞 _ 18 case _ BunsetuBILabel=I|BunsetuPositionType=SYN_HEAD|LUWBILabel=B|LUWPOS=助詞-副助詞|SpaceAfter=No|UnidicInfo=,や,や,や,ヤ,,,ヤ,ヤ,や
+20 抗議 抗議 NOUN 名詞-普通名詞-サ変可能 _ 22 nmod _ BunsetuBILabel=B|BunsetuPositionType=SEM_HEAD|LUWBILabel=B|LUWPOS=名詞-普通名詞-一般|SpaceAfter=No|UnidicInfo=,抗議,抗議,抗議,コーギ,,,コウギ,コウギ,抗議
+21 の の ADP 助詞-格助詞 _ 20 case _ BunsetuBILabel=I|BunsetuPositionType=SYN_HEAD|LUWBILabel=B|LUWPOS=助詞-格助詞|SpaceAfter=No|UnidicInfo=,の,の,の,ノ,,,ノ,ノ,の
+22 声 声 NOUN 名詞-普通名詞-一般 _ 24 obj _ BunsetuBILabel=B|BunsetuPositionType=SEM_HEAD|LUWBILabel=B|LUWPOS=名詞-普通名詞-一般|SpaceAfter=No|UnidicInfo=,声,声,声,コエ,,,コエ,コエ,声
+23 を を ADP 助詞-格助詞 _ 22 case _ BunsetuBILabel=I|BunsetuPositionType=SYN_HEAD|LUWBILabel=B|LUWPOS=助詞-格助詞|SpaceAfter=No|UnidicInfo=,を,を,を,オ,,,ヲ,ヲ,を
+24 挙げ 上げる VERB 動詞-非自立可能-下一段-ガ行 _ 27 acl _ BunsetuBILabel=B|BunsetuPositionType=SEM_HEAD|LUWBILabel=B|LUWPOS=動詞-一般-下一段-ガ行|SpaceAfter=No|UnidicInfo=,上げる,挙げ,挙げる,アゲ,,,アゲル,アゲル,上げる
+25 て て SCONJ 助詞-接続助詞 _ 24 mark _ BunsetuBILabel=I|BunsetuPositionType=FUNC|LUWBILabel=B|LUWPOS=助動詞-上一段-ア行|SpaceAfter=No|UnidicInfo=,て,て,て,テ,,,テ,テイル,ている
+26 いる 居る VERB 動詞-非自立可能-上一段-ア行 _ 25 fixed _ BunsetuBILabel=I|BunsetuPositionType=SYN_HEAD|LUWBILabel=I|LUWPOS=助動詞-上一段-ア行|PrevUDLemma=いる|SpaceAfter=No|UnidicInfo=,居る,いる,いる,イル,,,イル,テイル,ている
+27 住民 住民 NOUN 名詞-普通名詞-一般 _ 29 nsubj _ BunsetuBILabel=B|BunsetuPositionType=SEM_HEAD|LUWBILabel=B|LUWPOS=名詞-普通名詞-一般|SpaceAfter=No|UnidicInfo=,住民,住民,住民,ジューミン,,,ジュウミン,ジュウミン,住民
+28 は は ADP 助詞-係助詞 _ 27 case _ BunsetuBILabel=I|BunsetuPositionType=SYN_HEAD|LUWBILabel=B|LUWPOS=助詞-係助詞|SpaceAfter=No|UnidicInfo=,は,は,は,ワ,,,ハ,ハ,は
+29 い 居る VERB 動詞-非自立可能-上一段-ア行 _ 0 root _ BunsetuBILabel=B|BunsetuPositionType=ROOT|LUWBILabel=B|LUWPOS=動詞-一般-上一段-ア行|PrevUDLemma=いる|SpaceAfter=No|UnidicInfo=,居る,い,いる,イ,,,イル,イル,居る
+30 ない ない AUX 助動詞-助動詞-ナイ Polarity=Neg 29 aux _ BunsetuBILabel=I|BunsetuPositionType=SYN_HEAD|LUWBILabel=B|LUWPOS=助動詞-助動詞-ナイ|SpaceAfter=No|UnidicInfo=,ない,ない,ない,ナイ,,,ナイ,ナイ,ない
+31 よう 様 AUX 形状詞-助動詞語幹 _ 29 aux _ BunsetuBILabel=I|BunsetuPositionType=CONT|LUWBILabel=B|LUWPOS=形状詞-助動詞語幹|PrevUDLemma=よう|SpaceAfter=No|UnidicInfo=,様,よう,よう,ヨー,,,ヨウ,ヨウ,様
+32 です です AUX 助動詞-助動詞-デス _ 29 aux _ BunsetuBILabel=I|BunsetuPositionType=FUNC|LUWBILabel=B|LUWPOS=助動詞-助動詞-デス|PrevUDLemma=だ|SpaceAfter=No|UnidicInfo=,です,です,です,デス,,,デス,デス,です
+33 。 。 PUNCT 補助記号-句点 _ 29 punct _ BunsetuBILabel=I|BunsetuPositionType=CONT|LUWBILabel=B|LUWPOS=補助記号-句点|SpaceAfter=Yes|UnidicInfo=,。,。,。,,,,,,。
+
+# newdoc id = test-s2
+# sent_id = test-s2
+# text = 幸福の科学側からは,特にどうしてほしいという要望はいただいていません。
+1 幸福 幸福 NOUN 名詞-普通名詞-形状詞可能 _ 4 nmod _ BunsetuBILabel=B|BunsetuPositionType=SEM_HEAD|LUWBILabel=B|LUWPOS=名詞-普通名詞-一般|SpaceAfter=No|UnidicInfo=,幸福,幸福,幸福,コーフク,,,コウフク,コウフクノカガクガワ,幸福の科学側
+2 の の ADP 助詞-格助詞 _ 1 case _ BunsetuBILabel=I|BunsetuPositionType=SYN_HEAD|LUWBILabel=I|LUWPOS=名詞-普通名詞-一般|SpaceAfter=No|UnidicInfo=,の,の,の,ノ,,,ノ,コウフクノカガクガワ,幸福の科学側
+3 科学 科学 NOUN 名詞-普通名詞-サ変可能 _ 4 compound _ BunsetuBILabel=B|BunsetuPositionType=CONT|LUWBILabel=I|LUWPOS=名詞-普通名詞-一般|SpaceAfter=No|UnidicInfo=,科学,科学,科学,カガク,,,カガク,コウフクノカガクガワ,幸福の科学側
+4 側 側 NOUN 名詞-普通名詞-一般 _ 17 obl _ BunsetuBILabel=I|BunsetuPositionType=SEM_HEAD|LUWBILabel=I|LUWPOS=名詞-普通名詞-一般|SpaceAfter=No|UnidicInfo=,側,側,側,ガワ,,,ガワ,コウフクノカガクガワ,幸福の科学側
+5 から から ADP 助詞-格助詞 _ 4 case _ BunsetuBILabel=I|BunsetuPositionType=SYN_HEAD|LUWBILabel=B|LUWPOS=助詞-格助詞|SpaceAfter=No|UnidicInfo=,から,から,から,カラ,,,カラ,カラ,から
+6 は は ADP 助詞-係助詞 _ 4 case _ BunsetuBILabel=I|BunsetuPositionType=FUNC|LUWBILabel=B|LUWPOS=助詞-係助詞|SpaceAfter=No|UnidicInfo=,は,は,は,ワ,,,ハ,ハ,は
+7 , , PUNCT 補助記号-読点 _ 4 punct _ BunsetuBILabel=I|BunsetuPositionType=CONT|LUWBILabel=B|LUWPOS=補助記号-読点|SpaceAfter=No|UnidicInfo=,,,,,,,,,,,
+8 特に 特に ADV 副詞 _ 17 advmod _ BunsetuBILabel=B|BunsetuPositionType=SEM_HEAD|LUWBILabel=B|LUWPOS=副詞|SpaceAfter=No|UnidicInfo=,特に,特に,特に,トクニ,,,トクニ,トクニ,特に
+9 どう どう ADV 副詞 _ 15 advcl _ BunsetuBILabel=B|BunsetuPositionType=SEM_HEAD|LUWBILabel=B|LUWPOS=動詞-一般-サ行変格|SpaceAfter=No|UnidicInfo=,どう,どう,どう,ドー,,,ドウ,ドウスル,どうする
+10 し 為る AUX 動詞-非自立可能-サ行変格 _ 9 aux _ BunsetuBILabel=I|BunsetuPositionType=FUNC|LUWBILabel=I|LUWPOS=動詞-一般-サ行変格|PrevUDLemma=する|SpaceAfter=No|UnidicInfo=,為る,し,する,シ,,,スル,ドウスル,どうする
+11 て て SCONJ 助詞-接続助詞 _ 9 mark _ BunsetuBILabel=I|BunsetuPositionType=FUNC|LUWBILabel=B|LUWPOS=助動詞-形容詞|SpaceAfter=No|UnidicInfo=,て,て,て,テ,,,テ,テホシイ,てほしい
+12 ほしい 欲しい AUX 形容詞-非自立可能-形容詞 _ 11 fixed _ BunsetuBILabel=I|BunsetuPositionType=FUNC|LUWBILabel=I|LUWPOS=助動詞-形容詞|PrevUDLemma=ほしい|SpaceAfter=No|UnidicInfo=,欲しい,ほしい,ほしい,ホシー,,,ホシイ,テホシイ,てほしい
+13 と と ADP 助詞-格助詞 _ 9 case _ BunsetuBILabel=I|BunsetuPositionType=FUNC|LUWBILabel=B|LUWPOS=助詞-格助詞|SpaceAfter=No|UnidicInfo=,と,と,と,ト,,,ト,トイウ,という
+14 いう 言う VERB 動詞-一般-五段-ワア行 _ 13 fixed _ BunsetuBILabel=I|BunsetuPositionType=SYN_HEAD|LUWBILabel=I|LUWPOS=助詞-格助詞|SpaceAfter=No|UnidicInfo=,言う,いう,いう,イウ,,,イウ,トイウ,という
+15 要望 要望 NOUN 名詞-普通名詞-サ変可能 _ 17 nsubj _ BunsetuBILabel=B|BunsetuPositionType=SEM_HEAD|LUWBILabel=B|LUWPOS=名詞-普通名詞-一般|SpaceAfter=No|UnidicInfo=,要望,要望,要望,ヨーボー,,,ヨウボウ,ヨウボウ,要望
+16 は は ADP 助詞-係助詞 _ 15 case _ BunsetuBILabel=I|BunsetuPositionType=SYN_HEAD|LUWBILabel=B|LUWPOS=助詞-係助詞|SpaceAfter=No|UnidicInfo=,は,は,は,ワ,,,ハ,ハ,は
+17 いただい 頂く VERB 動詞-非自立可能-五段-カ行 _ 0 root _ BunsetuBILabel=B|BunsetuPositionType=ROOT|LUWBILabel=B|LUWPOS=動詞-一般-五段-カ行|PrevUDLemma=いただく|SpaceAfter=No|UnidicInfo=,頂く,いただい,いただく,イタダイ,,,イタダク,イタダク,頂く
+18 て て SCONJ 助詞-接続助詞 _ 17 mark _ BunsetuBILabel=I|BunsetuPositionType=FUNC|LUWBILabel=B|LUWPOS=助動詞-上一段-ア行|SpaceAfter=No|UnidicInfo=,て,て,て,テ,,,テ,テイル,ている
+19 い 居る VERB 動詞-非自立可能-上一段-ア行 _ 18 fixed _ BunsetuBILabel=I|BunsetuPositionType=FUNC|LUWBILabel=I|LUWPOS=助動詞-上一段-ア行|PrevUDLemma=いる|SpaceAfter=No|UnidicInfo=,居る,い,いる,イ,,,イル,テイル,ている
+20 ませ ます AUX 助動詞-助動詞-マス _ 17 aux _ BunsetuBILabel=I|BunsetuPositionType=FUNC|LUWBILabel=B|LUWPOS=助動詞-助動詞-マス|SpaceAfter=No|UnidicInfo=,ます,ませ,ます,マセ,,,マス,マス,ます
+21 ん ず AUX 助動詞-助動詞-ヌ Polarity=Neg 17 aux _ BunsetuBILabel=I|BunsetuPositionType=SYN_HEAD|LUWBILabel=B|LUWPOS=助動詞-助動詞-ヌ|PrevUDLemma=ぬ|SpaceAfter=No|UnidicInfo=,ず,ん,ぬ,ン,,,ヌ,ズ,ず
+22 。 。 PUNCT 補助記号-句点 _ 17 punct _ BunsetuBILabel=I|BunsetuPositionType=CONT|LUWBILabel=B|LUWPOS=補助記号-句点|SpaceAfter=Yes|UnidicInfo=,。,。,。,,,,,,。
+
+# newdoc id = test-s3
+# sent_id = test-s3
+# text = 星取り参加は当然とされ,不参加は白眼視される。
+1 星取り 星取り NOUN 名詞-普通名詞-一般 _ 2 compound _ BunsetuBILabel=B|BunsetuPositionType=CONT|LUWBILabel=B|LUWPOS=名詞-普通名詞-一般|SpaceAfter=No|UnidicInfo=,星取り,星取り,星取り,ホシトリ,,,ホシトリ,ホシトリサンカ,星取り参加
+2 参加 参加 NOUN 名詞-普通名詞-サ変可能 _ 4 nsubj _ BunsetuBILabel=I|BunsetuPositionType=SEM_HEAD|LUWBILabel=I|LUWPOS=名詞-普通名詞-一般|SpaceAfter=No|UnidicInfo=,参加,参加,参加,サンカ,,,サンカ,ホシトリサンカ,星取り参加
+3 は は ADP 助詞-係助詞 _ 2 case _ BunsetuBILabel=I|BunsetuPositionType=SYN_HEAD|LUWBILabel=B|LUWPOS=助詞-係助詞|SpaceAfter=No|UnidicInfo=,は,は,は,ワ,,,ハ,ハ,は
+4 当然 当然 ADJ 形状詞-一般 _ 6 advcl _ BunsetuBILabel=B|BunsetuPositionType=SEM_HEAD|LUWBILabel=B|LUWPOS=形状詞-一般|SpaceAfter=No|UnidicInfo=,当然,当然,当然,トーゼン,,,トウゼン,トウゼン,当然
+5 と と ADP 助詞-格助詞 _ 4 case _ BunsetuBILabel=I|BunsetuPositionType=SYN_HEAD|LUWBILabel=B|LUWPOS=助詞-格助詞|SpaceAfter=No|UnidicInfo=,と,と,と,ト,,,ト,ト,と
+6 さ 為る VERB 動詞-非自立可能-サ行変格 _ 13 acl _ BunsetuBILabel=B|BunsetuPositionType=SEM_HEAD|LUWBILabel=B|LUWPOS=動詞-一般-サ行変格|PrevUDLemma=する|SpaceAfter=No|UnidicInfo=,為る,さ,する,サ,,,スル,スル,する
+7 れ れる AUX 助動詞-助動詞-レル _ 6 aux _ BunsetuBILabel=I|BunsetuPositionType=SYN_HEAD|LUWBILabel=B|LUWPOS=助動詞-助動詞-レル|SpaceAfter=No|UnidicInfo=,れる,れ,れる,レ,,,レル,レル,れる
+8 , , PUNCT 補助記号-読点 _ 6 punct _ BunsetuBILabel=I|BunsetuPositionType=CONT|LUWBILabel=B|LUWPOS=補助記号-読点|SpaceAfter=No|UnidicInfo=,,,,,,,,,,,
+9 不 不 NOUN 接頭辞 Polarity=Neg 10 compound _ BunsetuBILabel=B|BunsetuPositionType=CONT|LUWBILabel=B|LUWPOS=名詞-普通名詞-一般|SpaceAfter=No|UnidicInfo=,不,不,不,フ,,,フ,フサンカ,不参加
+10 参加 参加 NOUN 名詞-普通名詞-サ変可能 _ 13 nsubj _ BunsetuBILabel=I|BunsetuPositionType=SEM_HEAD|LUWBILabel=I|LUWPOS=名詞-普通名詞-一般|SpaceAfter=No|UnidicInfo=,参加,参加,参加,サンカ,,,サンカ,フサンカ,不参加
+11 は は ADP 助詞-係助詞 _ 10 case _ BunsetuBILabel=I|BunsetuPositionType=SYN_HEAD|LUWBILabel=B|LUWPOS=助詞-係助詞|SpaceAfter=No|UnidicInfo=,は,は,は,ワ,,,ハ,ハ,は
+12 白眼 白眼 NOUN 名詞-普通名詞-一般 _ 13 compound _ BunsetuBILabel=B|BunsetuPositionType=CONT|LUWBILabel=B|LUWPOS=動詞-一般-サ行変格|SpaceAfter=No|UnidicInfo=,白眼,白眼,白眼,ハクガン,,,ハクガン,ハクガンシスル,白眼視する
+13 視 視 NOUN 接尾辞-名詞的-サ変可能 _ 0 root _ BunsetuBILabel=I|BunsetuPositionType=ROOT|LUWBILabel=I|LUWPOS=動詞-一般-サ行変格|SpaceAfter=No|UnidicInfo=,視,視,視,シ,,,シ,ハクガンシスル,白眼視する
+14 さ 為る AUX 動詞-非自立可能-サ行変格 _ 13 aux _ BunsetuBILabel=I|BunsetuPositionType=FUNC|LUWBILabel=I|LUWPOS=動詞-一般-サ行変格|PrevUDLemma=する|SpaceAfter=No|UnidicInfo=,為る,さ,する,サ,,,スル,ハクガンシスル,白眼視する
+15 れる れる AUX 助動詞-助動詞-レル _ 13 aux _ BunsetuBILabel=I|BunsetuPositionType=SYN_HEAD|LUWBILabel=B|LUWPOS=助動詞-助動詞-レル|SpaceAfter=No|UnidicInfo=,れる,れる,れる,レル,,,レル,レル,れる
+16 。 。 PUNCT 補助記号-句点 _ 13 punct _ BunsetuBILabel=I|BunsetuPositionType=CONT|LUWBILabel=B|LUWPOS=補助記号-句点|SpaceAfter=Yes|UnidicInfo=,。,。,。,,,,,,。
\ No newline at end of file