Text segmentation

From Wikipedia, the free encyclopedia

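Text segmentation is the process of dividing a piece of text into a number of units that are each meaningful on their own, so that further analysis or other processing can be carried out on them. Common cases include splitting a passage into sentences or into individual words.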

For example:[1]

Input: San Pedro is a town on the southern part of the island of Ambergris Caye in the Belize District of the nation of Belize, in Central America. According to 2015 mid-year estimates, the town has a population of about 16,444. It is the second-largest town in the Belize District and largest in the Belize Rural South constituency.
Output (split into separate sentences):
  1. San Pedro is a town on the southern part of the island of Ambergris Caye in the Belize District of the nation of Belize, in Central America.
  2. According to 2015 mid-year estimates, the town has a population of about 16,444.
  3. It is the second-largest town in the Belize District and largest in the Belize Rural South constituency.
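
The kind of splitting described above can be sketched in a few lines of Python. This is only a minimal illustration, not the method from the cited paper: the boundary rule (sentence-final punctuation, then whitespace, then an uppercase letter) and the function names split_sentences and split_words are assumptions made for this example.

```python
import re

# Naive boundary rule (an assumption for illustration): a sentence ends at
# '.', '!' or '?' when followed by whitespace and an uppercase letter.
_SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")


def split_sentences(text):
    """Split text into sentences using the naive boundary rule above."""
    return [s.strip() for s in _SENTENCE_BOUNDARY.split(text) if s.strip()]


def split_words(sentence):
    """Split a sentence into tokens: runs of letters, digits or apostrophes."""
    return re.findall(r"[A-Za-z0-9']+", sentence)


text = (
    "San Pedro is a town on the southern part of the island of Ambergris Caye "
    "in the Belize District of the nation of Belize, in Central America. "
    "According to 2015 mid-year estimates, the town has a population of "
    "about 16,444."
)

# Print each detected sentence on its own numbered line.
for i, sentence in enumerate(split_sentences(text), start=1):
    print(f"{i}. {sentence}")
```

A rule this simple is likely to go wrong on abbreviations such as "Dr. Smith" and on sentences that begin with a digit or a quotation mark. It also does not apply to languages written without spaces between words, where word segmentation usually needs a dictionary or a statistical model.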

See also

References

  1. Freddy Y. Y. Choi (2000). "Advances in domain independent linear text segmentation". Proceedings of the 1st Meeting of the North American Chapter of the Association for Computational Linguistics (ANLP-NAACL-00). pp. 26–33.