US7657423B1 - Automatic completion of fragments of text - Google Patents

Publication number
US7657423B1
Authority
US
United States
Prior art keywords
sentence
text
endings
text fragment
sentences
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active, expires
Application number
US10/697,333
Inventor
Georges R. Harik
Simon Tong
David R. Cheng
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Google LLC
Original Assignee
Google LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Google LLC
Priority to US10/697,333 (patent US7657423B1)
Assigned to GOOGLE INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: HARIK, GEORGES R., TONG, SIMON, CHENG, DAVID R.
Priority to US12/636,926 (patent US8024178B1)
Application granted
Publication of US7657423B1
Priority to US13/235,025 (patent US8280722B1)
Priority to US13/598,089 (patent US8521515B1)
Assigned to GOOGLE LLC. CHANGE OF NAME (SEE DOCUMENT FOR DETAILS). Assignors: GOOGLE INC.
Legal status: Active
Adjusted expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/274: Converting codes to words; Guess-ahead of partial word inputs

Definitions

  • the present invention relates generally to information retrieval systems and, more particularly, to systems and methods for automatically completing fragments of text (e.g., sentences or paragraphs).
  • a method for completing fragments of text may include obtaining a text fragment and performing a search, based at least in part on the text fragment, to identify one or more documents.
  • the method may also include identifying sentences within the one or more documents that are associated with the text fragment, determining sentence endings associated with the identified sentences, and presenting the sentence endings as potential completions for the text fragment.
  • a computer device includes a memory configured to store code and a processor configured to execute the code in the memory.
  • the code in the memory may include document preparation code and assistant code.
  • the document preparation code is configured to permit a user to prepare or edit a document.
  • the assistant code is configured to detect a fragment of text within the document, obtain potential sentence completions for the fragment of text, and present the potential sentence completions to the user.
  • a computer device includes a memory configured to store instructions and a processor configured to execute the instructions in the memory.
  • the processor may obtain a fragment of text and search for local documents that include at least a portion of the fragment of text.
  • the processor may identify sentences within the local documents that are associated with the fragment of text, determine sentence completions associated with the located sentences, and provide the sentence completions as potential completions for the fragment of text.
  • FIG. 1 is a diagram of an exemplary network in which systems and methods consistent with the principles of the invention may be implemented;
  • FIG. 2 is an exemplary diagram of a client and/or server of FIG. 1 in an implementation consistent with the principles of the invention;
  • FIGS. 3A and 3B are flowcharts of exemplary processing for automatically completing a fragment of text according to an implementation consistent with the principles of the invention.
  • FIG. 4 is a diagram of an exemplary ranked list according to an implementation consistent with the principles of the invention.
  • Systems and methods consistent with the principles of the invention may automatically complete a fragment of text, such as a sentence or paragraph.
  • the systems and methods may identify possible endings from documents, such as web documents, and provide these endings as possible completions for the fragment of text.
  • FIG. 1 is an exemplary diagram of a network 100 in which systems and methods consistent with the principles of the invention may be implemented.
  • Network 100 may include multiple clients 110 connected to multiple servers 120 - 140 via a network 150 .
  • Network 150 may include a local area network (LAN), a wide area network (WAN), a telephone network, such as the Public Switched Telephone Network (PSTN), an intranet, the Internet, a memory device, another type of network, or a combination of networks.
  • Clients 110 may include client entities.
  • An entity may be defined as a device, such as a wireless telephone, a personal computer, a personal digital assistant (PDA), a laptop, or another type of computation or communication device, a thread or process running on one of these devices, and/or an object executable by one of these devices.
  • Servers 120 - 140 may include server entities that gather, process, search, and/or maintain documents in a manner consistent with the principles of the invention.
  • Clients 110 and servers 120 - 140 may connect to network 150 via wired, wireless, and/or optical connections.
  • server 120 may optionally include a search engine 125 usable by clients 110 .
  • Server 120 may crawl a corpus of documents (e.g., web pages) and store information associated with these documents in a repository of crawled documents.
  • Servers 130 and 140 may store or maintain documents that may be crawled by server 120 .
  • While servers 120 - 140 are shown as separate entities, it may be possible for one or more of servers 120 - 140 to perform one or more of the functions of another one or more of servers 120 - 140 .
  • For example, it may be possible that two or more of servers 120 - 140 are implemented as a single server. It may also be possible for a single one of servers 120 - 140 to be implemented as two or more separate (and possibly distributed) devices.
  • FIG. 2 is an exemplary diagram of a client or server entity (hereinafter called “client/server entity”), which may correspond to one or more of clients 110 and servers 120 - 140 , according to an implementation consistent with the principles of the invention.
  • the client/server entity may include a bus 210 , a processor 220 , a main memory 230 , a read only memory (ROM) 240 , a storage device 250 , one or more input devices 260 , one or more output devices 270 , and a communication interface 280 .
  • Bus 210 may include one or more conductors that permit communication among the components of the client/server entity.
  • Processor 220 may include any type of conventional processor or microprocessor that interprets and executes instructions.
  • Main memory 230 may include a random access memory (RAM) or another type of dynamic storage device that stores information and instructions for execution by processor 220 .
  • ROM 240 may include a conventional ROM device or another type of static storage device that stores static information and instructions for use by processor 220 .
  • Storage device 250 may include a magnetic and/or optical recording medium and its corresponding drive.
  • Input device(s) 260 may include one or more conventional mechanisms that permit an operator to input information to the client/server entity, such as a keyboard, a mouse, a pen, voice recognition and/or biometric mechanisms, etc.
  • Output device(s) 270 may include one or more conventional mechanisms that output information to the operator, including a display, a printer, a speaker, etc.
  • Communication interface 280 may include any transceiver-like mechanism that enables the client/server entity to communicate with other devices and/or systems.
  • communication interface 280 may include mechanisms for communicating with another device or system via a network, such as network 150 .
  • The client/server entity may perform certain searching-related operations.
  • the client/server entity may perform these operations in response to processor 220 executing software instructions contained in a computer-readable medium, such as memory 230 .
  • a computer-readable medium may be defined as one or more physical or logical memory devices and/or carrier waves.
  • the software instructions may be read into memory 230 from another computer-readable medium, such as data storage device 250 , or from another device via communication interface 280 .
  • The software instructions contained in memory 230 cause processor 220 to perform processes that will be described later.
  • hardwired circuitry may be used in place of or in combination with software instructions to implement processes consistent with the principles of the invention.
  • implementations consistent with the principles of the invention are not limited to any specific combination of hardware circuitry and software.
  • FIGS. 3A and 3B are flowcharts of exemplary processing for automatically completing fragments of text, such as sentences and paragraphs, according to an implementation consistent with the principles of the invention.
  • Processing may begin with server 120 receiving a search query from a user (act 310 ) ( FIG. 3A ).
  • a user may use conventional web browser software on client 110 to access search engine 125 of server 120 .
  • the user may then enter the search query via a graphical user interface provided by server 120 .
  • the search query may take different forms, such as a fragment of text.
  • the text fragment may be associated with a partial sentence, such as “Jane, I have to go because.”
  • the text fragment may be associated with a partial paragraph, such as “Now we are engaged in a great civil war, testing whether that nation, or any nation so conceived, and so dedicated, can long endure. We are met on a great battle field of that war.” While the description to follow will be described mainly in terms of completing sentences, the description is equally applicable to completing paragraphs.
  • Server 120 may perform a search for documents that contain the search query and retrieve the search results (act 320 ). For example, server 120 may search a corpus or repository of documents to identify documents that include the text fragment of the search query as a phrase. In another implementation, server 120 may search for documents that also include synonyms of the word(s) in the search query. In either case, the documents may include documents stored by one or more servers, such as servers 120 - 140 . Server 120 may optionally cap the number of documents included in the search results (e.g., server 120 may retrieve the top 100 documents). For each of these documents, server 120 may retrieve its title and text.
  • Server 120 may then determine whether there are sufficient search results (act 330 ). For example, server 120 may compare the number of search results retrieved with a threshold (e.g., five). When the number of search results is less than the threshold, the search results may not be adequate to satisfy the search query provided by the user. In this case, server 120 may form a shortened search query (act 340 ). For example, server 120 may drop one or more words from the search query.
  • server 120 may simply drop one or more words from the beginning or end of the search query.
  • server 120 may drop one or more words based on one or more symbols, such as a comma, semicolon, bracket, backslash, etc., contained in the search query. For example, if the search query includes a comma, then server 120 may drop everything before or after the comma. Server 120 may perform similar functions based on other symbols.
  • server 120 may analyze the structure of the search query to more intelligently drop one or more words. For example, server 120 may use a parse tree to identify parts of the search query. Server 120 may then drop one or more of these parts. In the sentence example provided above, server 120 may shorten the search query to “I have to go because,” dropping “Jane,” from the search query.
  • Server 120 may then perform a search for documents that contain the shortened search query and retrieve the search results (act 320 ). As described above, server 120 may search a corpus or repository of documents to identify documents that include the shortened search query as a phrase. Server 120 may then again determine whether there are sufficient search results (act 330 ).
  • server 120 may scan the text of the documents in the search results to identify sentences that contain the search query (act 350 ).
  • Server 120 may optionally locate periods within the documents to identify candidate sentences and then identify which of the candidate sentences include the search query.
  • the search query may be included at the beginning or elsewhere within the identified sentences.
  • Server 120 may give preference to a sentence that includes the search query at the beginning of the sentence over sentences where the search query occurs elsewhere.
  • Server 120 may optionally discard sentences where the search query occurs more than once within the same sentence.
  • server 120 may search left and right to determine the rough boundaries of the sentence containing the search query. For example, server 120 may look for periods (or other forms of punctuation) that typically precede and end a sentence. Server 120 may be programmed to ignore other typical occurrences of periods (and other forms of punctuation), such as when periods are used for initials, abbreviations, etc. Server 120 may optionally discard sentences that are missing punctuation and sentences that do not make sense (e.g., do not contain proper sentence structure).
  • Server 120 may then determine the sentence endings (also called “completions”) associated with the identified sentences (act 360 ) ( FIG. 3B ). For example, server 120 may identify the word(s) that follow the text fragment of the search query until the end of the sentence. Server 120 may define a quality sentence ending as one that “ends properly,” where “ends properly” is defined as: (1) the word(s) at the end make a better end of a sentence than they do a beginning of a sentence (e.g., year and pen); and (2) the last word is not in a list of bad endings (which may be maintained by server 120 ) (e.g., vs, dr, and aug).
  • IDF: inverse document frequency.
  • Server 120 may optionally trim and/or merge the sentence endings (act 370 ).
  • server 120 may consider the text and symbols included in the sentence ending. For example, server 120 may compare text of the sentence ending to entries in the start and end IDF tables to determine whether to cut the text. Server 120 may also consider symbols, such as a comma, semicolon, bracket, backslash, etc., when identifying what text to cut.
  • server 120 may treat the dash separately, considering the text until the dash as a substring and ignoring the text after the dash. Server 120 may also disregard entire sentence endings that contain a colon (to avoid noise from message postings).
  • Single word sentence endings may be considered when the word is significant (e.g., it is a common ending in the end IDF table). Based on the foregoing, server 120 may further consider a sentence ending that: (1) ends properly; and (2) does not separate a preposition (or possessive) from its object.
  • server 120 may search for sentence endings that overlap (i.e., sentence endings that have one or more words in common). Sentence endings may be merged based on their common parts. When merging sentence endings, server 120 may permit some small differences between them. For example, the sentence endings “has four legs and has a tail and barks” and “has four legs and a tail” may be merged to “has four legs and a tail.”
  • Server 120 may optionally score the sentence endings (act 380 ). For example, server 120 may score the sentence endings by popularity. In other words, sentence endings that occur more often in the documents retrieved by the search may be scored higher than sentence endings that do not occur as often. Server 120 may alternatively, or additionally, score the sentence endings based on where the text fragment of the search query occurs within the identified sentences. In other words, the sentence endings corresponding to sentences where the text fragment of the search query occurs at the beginning of the sentences may be scored higher than sentence endings corresponding to sentences where the text fragment occurs elsewhere within the sentences. Server 120 may also penalize sentence endings for being too long, decreasing their scores. Server 120 may separately consider all of the sentence endings that were used to create a merged sentence ending when determining the score of that sentence ending.
  • Server 120 may present the sentence endings to the user (act 390 ). If the sentence endings were scored in some manner, server 120 may organize the sentence endings into a ranked list that it may provide to the user. In one implementation, server 120 may present an initial group of sentence endings to the user. The user may then be permitted to cycle through subsequent groups in a conventional manner.
  • FIG. 4 is a diagram of an exemplary ranked list 400 according to an implementation consistent with the principles of the invention.
  • the exemplary ranked list 400 may include ranked items that each include a score 410 and a sentence ending (or “completion”) 420 .
  • The user has provided a partial sentence of “I need to go now because.”
  • Server 120 provided various sentence endings that complete the partial sentence.
  • the top-ranked sentence ending is “I have to get up early tomorrow.”
  • server 120 may provide sentence endings via a different interface.
  • server 120 may operate in conjunction with an application, such as a word processing application, an instant messenger application, an e-mail application, or another type of application via which documents (including messages) are prepared or edited.
  • a server assistant which may be in the form of executable code, such as a plug-in, an applet, a dynamic link library (DLL), or a similar type of executable object or process, resident on client 110 , may operate to obtain the sentence endings from server 120 .
  • the server assistant may notice text fragments that may require completion and communicate with server 120 to obtain the sentence endings.
  • the server assistant may “notice” the text fragments by detecting them automatically to obtain the sentence endings on-the-fly or by detecting them when instructed by the user.
  • the server assistant may automatically insert one of the sentence endings at the location of the user's cursor. For example, if the user types “I need to go because” and presses a special key, the server assistant may complete the sentence by automatically inserting one of the sentence endings. The user may then be permitted to view other possible sentence endings by pressing the special key again. Alternatively, subsequent sentence endings may be automatically presented after expiration of a possibly user-configurable amount of time. According to another implementation, the server assistant may present the sentence endings via a pop-up window, another type of interface, or a combination of interfaces (e.g., a first possible sentence ending may be automatically inserted, but subsequent sentence endings may be presented via a pop-up window).
  • Systems and methods consistent with the principles of the invention may automatically complete a fragment of text, such as a sentence or paragraph.
  • the systems and methods may identify possible endings from text in web documents.
  • server 120 may provide a separate interface for paragraph completion. In another implementation, server 120 may provide the same interface for sentence and paragraph completion. When searching for paragraph endings, server 120 may also look for synonyms of the words provided in the search query. Server 120 may provide paragraph endings separately from or along with sentence endings. For example, server 120 may score the paragraph endings and the sentence endings and rank them based on their scores. It may be possible for server 120 to provide paragraph endings instead of sentence endings when server 120 finds no (or very few) good sentence endings for the search query.
  • server 120 performs most, if not all, of the acts described with regard to the processing of FIGS. 3A and 3B .
  • one or more, or all, of the acts may be performed by client 110 .
  • client 110 may obtain a text fragment and search documents local to client 110 (e.g., documents stored by client 110 and/or documents stored by a database accessible by client 110 ) to identify one or more documents that contain the text fragment. From these documents, client 110 may then identify potential sentence completions for the text fragment.
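
The sentence-ending steps listed above (acts 350 to 390) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the naive punctuation-based sentence splitter, and the scoring weights (`start_bonus`, `max_words`, `length_penalty`) are all hypothetical. Only the bad-ending examples (vs, dr, aug), the colon rule, the one-occurrence rule, and the idea of scoring by popularity, fragment position, and length come from the text. Trimming, merging, and the IDF tables are omitted.

```python
import re

# Example "bad endings" taken from the text; a real list would be longer.
BAD_ENDINGS = {"vs", "dr", "aug"}


def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace.

    It does not special-case initials or abbreviations.
    """
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [p.strip() for p in parts if p.strip()]


def sentence_endings(texts, fragment):
    """Return (ending, at_start) pairs for sentences containing the fragment once."""
    pattern = re.compile(re.escape(fragment), re.IGNORECASE)
    found = []
    for text in texts:
        for sentence in split_sentences(text):
            matches = list(pattern.finditer(sentence))
            if len(matches) != 1:
                continue  # no match, or the fragment occurs more than once
            match = matches[0]
            ending = sentence[match.end():].strip().rstrip(".!?").strip()
            if not ending or ":" in ending:
                continue  # nothing to offer, or colon noise
            if ending.split()[-1].lower() in BAD_ENDINGS:
                continue  # last word is on the bad-endings list
            found.append((ending, match.start() == 0))
    return found


def rank_endings(found, start_bonus=0.5, max_words=12, length_penalty=0.1):
    """Score by popularity, favor fragment-at-start, penalize long endings."""
    scores = {}
    display = {}
    for ending, at_start in found:
        key = ending.lower()
        display.setdefault(key, ending)  # keep the first-seen spelling
        scores[key] = scores.get(key, 0.0) + 1.0 + (start_bonus if at_start else 0.0)
    ranked = []
    for key, score in scores.items():
        extra_words = max(0, len(key.split()) - max_words)
        ranked.append((score - length_penalty * extra_words, display[key]))
    ranked.sort(key=lambda item: (-item[0], item[1]))
    return ranked


texts = [
    "I need to go now because I have to get up early tomorrow. See you soon.",
    "Sorry, I need to go now because I have to get up early tomorrow!",
    "I need to go now because my bus is here.",
]
for score, ending in rank_endings(sentence_endings(texts, "I need to go now because")):
    print(score, ending)
```

Because the splitter breaks on every period, a sentence such as "I saw Dr. Smith." is cut after "Dr." and the resulting ending is rejected by the bad-endings check. A production system would more likely recognize the abbreviation and keep the sentence intact.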

Abstract

A system offers potential completions for fragments of text. The system may obtain a text fragment and identify documents that include the text fragment. The system may locate sentences within the documents that include at least a portion of the text fragment, identify sentence endings associated with the located sentences, and present the sentence endings as potential completions for the text fragment.

Description

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to information retrieval systems and, more particularly, to systems and methods for automatically completing fragments of text (e.g., sentences or paragraphs).
2. Description of Related Art
Oftentimes, people have trouble completing sentences and/or paragraphs. They know what they want to say but they cannot find the appropriate words to say it. These people may find it beneficial to be offered possible completions for sentences and/or paragraphs.
Accordingly, there exists a need for mechanisms that provide possible completions for fragments of text, such as partial sentences and/or paragraphs.
SUMMARY OF THE INVENTION
Systems and methods, consistent with the principles of the invention, automatically complete fragments of text, such as sentences or paragraphs.
According to one aspect consistent with the principles of the invention, a method for completing fragments of text is provided. The method may include obtaining a text fragment and performing a search, based at least in part on the text fragment, to identify one or more documents. The method may also include identifying sentences within the one or more documents that are associated with the text fragment, determining sentence endings associated with the identified sentences, and presenting the sentence endings as potential completions for the text fragment.
According to another aspect, a computer device includes a memory configured to store code and a processor configured to execute the code in the memory. The code in the memory may include document preparation code and assistant code. The document preparation code is configured to permit a user to prepare or edit a document. The assistant code is configured to detect a fragment of text within the document, obtain potential sentence completions for the fragment of text, and present the potential sentence completions to the user.
According to a further aspect, a computer device includes a memory configured to store instructions and a processor configured to execute the instructions in the memory. The processor may obtain a fragment of text and search for local documents that include at least a portion of the fragment of text. The processor may identify sentences within the local documents that are associated with the fragment of text, determine sentence completions associated with the located sentences, and provide the sentence completions as potential completions for the fragment of text.
BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate an embodiment of the invention and, together with the description, explain the invention. In the drawings,
FIG. 1 is a diagram of an exemplary network in which systems and methods consistent with the principles of the invention may be implemented;
FIG. 2 is an exemplary diagram of a client and/or server of FIG. 1 in an implementation consistent with the principles of the invention;
FIGS. 3A and 3B are flowcharts of exemplary processing for automatically completing a fragment of text according to an implementation consistent with the principles of the invention; and
FIG. 4 is a diagram of an exemplary ranked list according to an implementation consistent with the principles of the invention.
DETAILED DESCRIPTION
The following detailed description of the invention refers to the accompanying drawings. The same reference numbers in different drawings may identify the same or similar elements. Also, the following detailed description does not limit the invention.
Systems and methods consistent with the principles of the invention may automatically complete a fragment of text, such as a sentence or paragraph. The systems and methods may identify possible endings from documents, such as web documents, and provide these endings as possible completions for the fragment of text.
Exemplary Network Configuration
FIG. 1 is an exemplary diagram of a network 100 in which systems and methods consistent with the principles of the invention may be implemented. Network 100 may include multiple clients 110 connected to multiple servers 120-140 via a network 150. Network 150 may include a local area network (LAN), a wide area network (WAN), a telephone network, such as the Public Switched Telephone Network (PSTN), an intranet, the Internet, a memory device, another type of network, or a combination of networks. Two clients 110 and three servers 120-140 have been illustrated as connected to network 150 for simplicity. In practice, there may be more or fewer clients and servers. Also, in some instances, a client may perform the functions of a server and a server may perform the functions of a client.
Clients 110 may include client entities. An entity may be defined as a device, such as a wireless telephone, a personal computer, a personal digital assistant (PDA), a laptop, or another type of computation or communication device, a thread or process running on one of these devices, and/or an object executable by one of these devices. Servers 120-140 may include server entities that gather, process, search, and/or maintain documents in a manner consistent with the principles of the invention. Clients 110 and servers 120-140 may connect to network 150 via wired, wireless, and/or optical connections.
In an implementation consistent with the principles of the invention, server 120 may optionally include a search engine 125 usable by clients 110. Server 120 may crawl a corpus of documents (e.g., web pages) and store information associated with these documents in a repository of crawled documents. Servers 130 and 140 may store or maintain documents that may be crawled by server 120. While servers 120-140 are shown as separate entities, it may be possible for one or more of servers 120-140 to perform one or more of the functions of another one or more of servers 120-140. For example, it may be possible that two or more of servers 120-140 are implemented as a single server. It may also be possible for a single one of servers 120-140 to be implemented as two or more separate (and possibly distributed) devices.
Exemplary Client/Server Architecture
FIG. 2 is an exemplary diagram of a client or server entity (hereinafter called “client/server entity”), which may correspond to one or more of clients 110 and servers 120-140, according to an implementation consistent with the principles of the invention. The client/server entity may include a bus 210, a processor 220, a main memory 230, a read only memory (ROM) 240, a storage device 250, one or more input devices 260, one or more output devices 270, and a communication interface 280. Bus 210 may include one or more conductors that permit communication among the components of the client/server entity.
Processor 220 may include any type of conventional processor or microprocessor that interprets and executes instructions. Main memory 230 may include a random access memory (RAM) or another type of dynamic storage device that stores information and instructions for execution by processor 220. ROM 240 may include a conventional ROM device or another type of static storage device that stores static information and instructions for use by processor 220. Storage device 250 may include a magnetic and/or optical recording medium and its corresponding drive.
Input device(s) 260 may include one or more conventional mechanisms that permit an operator to input information to the client/server entity, such as a keyboard, a mouse, a pen, voice recognition and/or biometric mechanisms, etc. Output device(s) 270 may include one or more conventional mechanisms that output information to the operator, including a display, a printer, a speaker, etc. Communication interface 280 may include any transceiver-like mechanism that enables the client/server entity to communicate with other devices and/or systems. For example, communication interface 280 may include mechanisms for communicating with another device or system via a network, such as network 150.
As will be described in detail below, the client/server entity, consistent with the principles of the invention, may perform certain searching-related operations. The client/server entity may perform these operations in response to processor 220 executing software instructions contained in a computer-readable medium, such as memory 230. A computer-readable medium may be defined as one or more physical or logical memory devices and/or carrier waves.
The software instructions may be read into memory 230 from another computer-readable medium, such as data storage device 250, or from another device via communication interface 280. The software instructions contained in memory 230 cause processor 220 to perform processes that will be described later. Alternatively, hardwired circuitry may be used in place of or in combination with software instructions to implement processes consistent with the principles of the invention. Thus, implementations consistent with the principles of the invention are not limited to any specific combination of hardware circuitry and software.
Exemplary Processing
FIGS. 3A and 3B are flowcharts of exemplary processing for automatically completing fragments of text, such as sentences and paragraphs, according to an implementation consistent with the principles of the invention. Processing may begin with server 120 receiving a search query from a user (act 310) (FIG. 3A). For example, a user may use conventional web browser software on client 110 to access search engine 125 of server 120. The user may then enter the search query via a graphical user interface provided by server 120.
The search query may take different forms, such as a fragment of text. The text fragment may be associated with a partial sentence, such as “Jane, I have to go because.” Alternatively, the text fragment may be associated with a partial paragraph, such as “Now we are engaged in a great civil war, testing whether that nation, or any nation so conceived, and so dedicated, can long endure. We are met on a great battle field of that war.” While the description to follow will be described mainly in terms of completing sentences, the description is equally applicable to completing paragraphs.
Server 120 may perform a search for documents that contain the search query and retrieve the search results (act 320). For example, server 120 may search a corpus or repository of documents to identify documents that include the text fragment of the search query as a phrase. In another implementation, server 120 may search for documents that also include synonyms of the word(s) in the search query. In either case, the documents may include documents stored by one or more servers, such as servers 120-140. Server 120 may optionally cap the number of documents included in the search results (e.g., server 120 may retrieve the top 100 documents). For each of these documents, server 120 may retrieve its title and text.
Server 120 may then determine whether there are sufficient search results (act 330). For example, server 120 may compare the number of search results retrieved with a threshold (e.g., five). When the number of search results is less than the threshold, the search results may not be adequate to satisfy the search query provided by the user. In this case, server 120 may form a shortened search query (act 340). For example, server 120 may drop one or more words from the search query.
Several techniques exist for determining what word(s) to drop. For example, according to one implementation, server 120 may simply drop one or more words from the beginning or end of the search query. According to another implementation, server 120 may drop one or more words based on one or more symbols, such as a comma, semicolon, bracket, backslash, etc., contained in the search query. For example, if the search query includes a comma, then server 120 may drop everything before or after the comma. Server 120 may perform similar functions based on other symbols. According to yet another implementation, server 120 may analyze the structure of the search query to more intelligently drop one or more words. For example, server 120 may use a parse tree to identify parts of the search query. Server 120 may then drop one or more of these parts. In the sentence example provided above, server 120 may shorten the search query to “I have to go because,” dropping “Jane,” from the search query.
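By way of illustration only, the symbol-based and positional shortening techniques described above might be sketched as follows. The function name `shorten_query`, the choice to keep the text after the first comma or semicolon, and the fallback of dropping the first word are assumptions made for this sketch; the description does not prescribe them, and the parse-tree technique is not shown.

```python
import re


def shorten_query(query: str) -> str:
    """Return a shorter query, or the query unchanged if nothing can be dropped."""
    # Symbol-based rule (assumed): keep only the text after the first
    # comma or semicolon, dropping everything before it.
    match = re.search(r"[,;]", query)
    if match:
        tail = query[match.end():].strip()
        if tail:
            return tail
    # Positional rule (assumed): drop one word from the beginning.
    words = query.split()
    if len(words) > 1:
        return " ".join(words[1:])
    # A single-word query cannot be shortened further.
    return query
```

With the sentence example above, `shorten_query("Jane, I have to go because")` yields "I have to go because".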
Server 120 may then perform a search for documents that contain the shortened search query and retrieve the search results (act 320). As described above, server 120 may search a corpus or repository of documents to identify documents that include the shortened search query as a phrase. Server 120 may then again determine whether there are sufficient search results (act 330).
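The loop through acts 320, 330, and 340 might be sketched as follows. This is a minimal sketch, not the described implementation: `search` and `shorten` are hypothetical callables supplied by the caller, the defaults of 5 and 100 mirror the example threshold and cap mentioned above, and `max_rounds` is an added safeguard that the description does not mention.

```python
from typing import Callable, List, Tuple


def search_with_fallback(
    query: str,
    search: Callable[[str], List[str]],
    shorten: Callable[[str], str],
    threshold: int = 5,
    limit: int = 100,
    max_rounds: int = 10,
) -> Tuple[str, List[str]]:
    """Search, and shorten the query while too few results come back."""
    results: List[str] = []
    for _ in range(max_rounds):
        # Act 320: search and cap the number of retrieved documents.
        results = search(query)[:limit]
        # Act 330: stop once there are sufficient results.
        if len(results) >= threshold:
            break
        # Act 340: form a shortened query; stop if it cannot be shortened.
        shorter = shorten(query)
        if shorter == query:
            break
        query = shorter
    return query, results
```

The function returns the query that was finally used together with its results, so that later acts can match sentences against the shortened fragment rather than the original one.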
When there are sufficient search results (e.g., the number of search results is greater than or equal to the threshold), server 120 may scan the text of the documents in the search results to identify sentences that contain the search query (act 350). Server 120 may optionally locate periods within the documents to identify candidate sentences and then identify which of the candidate sentences include the search query. The search query may be included at the beginning or elsewhere within the identified sentences. Server 120 may give preference to a sentence that includes the search query at the beginning of the sentence over sentences where the search query occurs elsewhere. Server 120 may optionally discard sentences where the search query occurs more than once within the same sentence.
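As a rough sketch of the filtering in act 350, the following assumes the candidate sentences have already been split out of the documents. The function name `candidate_sentences`, the case-insensitive matching, and the use of a simple sort to express the "preference" for start-of-sentence matches are assumptions of this sketch.

```python
from typing import List, Tuple


def candidate_sentences(sentences: List[str], fragment: str) -> List[Tuple[str, bool]]:
    """Keep sentences containing the fragment exactly once.

    Each result is (sentence, at_start), where at_start is True when the
    fragment occurs at the beginning of the sentence.
    """
    frag = fragment.lower()
    kept = []
    for sentence in sentences:
        low = sentence.lower()
        # Discard sentences without the fragment or with repeated occurrences.
        if low.count(frag) != 1:
            continue
        kept.append((sentence, low.lstrip().startswith(frag)))
    # Stable sort: sentences that begin with the fragment come first.
    kept.sort(key=lambda pair: not pair[1])
    return kept
```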
For each occurrence of the search query, server 120 may search left and right to determine the rough boundaries of the sentence containing the search query. For example, server 120 may look for periods (or other forms of punctuation) that typically precede and end a sentence. Server 120 may be programmed to ignore other typical occurrences of periods (and other forms of punctuation), such as when periods are used for initials, abbreviations, etc. Server 120 may optionally discard sentences that are missing punctuation and sentences that do not make sense (e.g., do not contain proper sentence structure).
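The left-and-right boundary scan described above might look like the following sketch. The abbreviation list, the rule that a single capital letter before a period is an initial, and the decision to return `None` when terminal punctuation is missing are all assumptions chosen for illustration; a real implementation would need a fuller abbreviation list and would misjudge some cases (for example, a sentence ending in the word "I").

```python
import re
from typing import Optional

# Illustrative abbreviations whose trailing period does not end a sentence.
ABBREVIATIONS = {"dr", "mr", "mrs", "ms", "vs", "etc", "aug"}


def _is_sentence_end(text: str, i: int) -> bool:
    """True if text[i] is punctuation that plausibly ends a sentence."""
    if text[i] in "!?":
        return True
    if text[i] != ".":
        return False
    before = re.search(r"(\w+)$", text[:i])
    if before:
        word = before.group(1)
        # Ignore periods after initials ("J.") and known abbreviations.
        if len(word) == 1 and word.isalpha() and word.isupper():
            return False
        if word.lower() in ABBREVIATIONS:
            return False
    return True


def sentence_around(text: str, fragment: str) -> Optional[str]:
    """Return the sentence containing the first occurrence of fragment."""
    pos = text.find(fragment)
    if pos < 0:
        return None
    # Scan left to just after the previous sentence-ending punctuation.
    start = pos
    while start > 0 and not _is_sentence_end(text, start - 1):
        start -= 1
    # Scan right to the punctuation that ends this sentence.
    end = pos + len(fragment)
    while end < len(text) and not _is_sentence_end(text, end):
        end += 1
    if end == len(text):
        return None  # missing terminal punctuation: discard
    return text[start:end + 1].strip()
```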
Server 120 may then determine the sentence endings (also called “completions”) associated with the identified sentences (act 360) (FIG. 3B). For example, server 120 may identify the word(s) that follow the text fragment of the search query until the end of the sentence. Server 120 may define a quality sentence ending as one that “ends properly,” where “ends properly” is defined as: (1) the word(s) at the end make a better end of a sentence than they do a beginning of a sentence (e.g., year and pen); and (2) the last word is not in a list of bad endings (which may be maintained by server 120) (e.g., vs, dr, and aug).
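The first part of act 360, taking the word(s) that follow the text fragment up to the end of the sentence, might be sketched as below. The function name `sentence_ending`, the case-insensitive match, and the stripping of terminal punctuation are assumptions; the "ends properly" test is not included here.

```python
from typing import Optional


def sentence_ending(sentence: str, fragment: str) -> Optional[str]:
    """Return the text after fragment up to the end of the sentence."""
    pos = sentence.lower().find(fragment.lower())
    if pos < 0:
        return None
    ending = sentence[pos + len(fragment):].strip()
    # Drop terminal punctuation so identical endings compare equal.
    ending = ending.rstrip(".!?").strip()
    return ending or None
```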
To help in determining whether a word makes a better end of a sentence than a beginning of a sentence, a set of inverse document frequency (IDF) tables may be generated. IDF refers to a measure of a word's importance. In this case, two IDF tables may be generated. One table (hereinafter referred to as “start IDF table”) may include uni-grams and bi-grams that are common at the start of sentences. The other table (hereinafter referred to as “end IDF table”) may include uni-grams and bi-grams that are common at the end of sentences. To determine what is “common,” a corpus of documents may be analyzed to identify the text that occurs around a period. Whether a word makes a better end of a sentence may be determined by analyzing the start and end IDF tables.
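One possible reading of the start and end tables is sketched below. The description does not give a formula, so the smoothed IDF-like weight `log((n + 1) / (count + 1))`, the restriction to uni-grams, and the function names are assumptions of this sketch. A lower weight means the word is more common in that position.

```python
import math
import re
from collections import Counter
from typing import Iterable, Set, Tuple


def build_tables(sentences: Iterable[str]) -> Tuple[Counter, Counter, int]:
    """Count the first and last word of each sentence (uni-grams only)."""
    start, end, n = Counter(), Counter(), 0
    for sentence in sentences:
        words = re.findall(r"[a-z']+", sentence.lower())
        if words:
            start[words[0]] += 1
            end[words[-1]] += 1
        n += 1
    return start, end, n


def idf(word: str, table: Counter, n: int) -> float:
    """Smoothed IDF-like weight; lower means more common in this table."""
    return math.log((n + 1) / (table[word] + 1))


def ends_properly(ending: str, start: Counter, end: Counter, n: int,
                  bad_endings: Set[str]) -> bool:
    """True if the last word is not a bad ending and is more common at
    the end of sentences than at the start."""
    words = re.findall(r"[a-z']+", ending.lower())
    if not words or words[-1] in bad_endings:
        return False
    last = words[-1]
    return idf(last, end, n) < idf(last, start, n)
```

In this sketch, a word never seen in either position has equal weights in both tables and is therefore rejected, which is a conservative choice rather than something the description requires.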
Server 120 may optionally trim and/or merge the sentence endings (act 370). When determining whether to trim a sentence ending, server 120 may consider the text and symbols included in the sentence ending. For example, server 120 may compare text of the sentence ending to entries in the start and end IDF tables to determine whether to cut the text. Server 120 may also consider symbols, such as a comma, semicolon, bracket, backslash, etc., when identifying what text to cut. In one implementation, server 120 may treat the dash separately, considering the text until the dash as a substring and ignoring the text after the dash. Server 120 may also disregard entire sentence endings that contain a colon (to avoid noise from message postings). Single word sentence endings may be considered when the word is significant (e.g., it is a common ending in the end IDF table). Based on the foregoing, server 120 may further consider a sentence ending that: (1) ends properly; and (2) does not separate a preposition (or possessive) from its object.
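Some of the trimming rules above (discarding endings that contain a colon, cutting at a dash, and not leaving a preposition or possessive dangling) might be sketched as follows. The word list `DANGLING`, the treatment of a dash only when it is surrounded by whitespace, and the function name are assumptions; the comparison against the start and end IDF tables is omitted.

```python
import re
from typing import Optional

# Illustrative prepositions and possessives that should not end a completion.
DANGLING = {
    "of", "to", "in", "on", "at", "for", "with", "by", "from",
    "my", "your", "his", "her", "its", "our", "their",
}


def trim_ending(ending: str) -> Optional[str]:
    """Trim a sentence ending, or return None if it should be discarded."""
    # Endings containing a colon are disregarded entirely.
    if ":" in ending:
        return None
    # Keep the text up to a free-standing dash and ignore the rest.
    ending = re.split(r"\s+-{1,2}\s+", ending, maxsplit=1)[0]
    words = ending.strip().rstrip(".!?").split()
    # Do not separate a preposition or possessive from its object.
    while words and (words[-1].lower() in DANGLING
                     or words[-1].lower().endswith("'s")):
        words.pop()
    return " ".join(words) or None
```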
When determining whether to merge sentence endings, server 120 may search for sentence endings that overlap (i.e., sentence endings that have one or more words in common). Sentence endings may be merged based on their common parts. When merging sentence endings, server 120 may permit some small differences between them. For example, the sentence endings “has four legs and has a tail and barks” and “has four legs and a tail” may be merged to “has four legs and a tail.”
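The description leaves the merging rule open, so the following is only one possible sketch. It assumes that two endings overlap when they are identical or share at least `min_common` leading words, and that the shortest member of a group stands in for the merged ending; both are assumptions chosen so that the example above comes out as stated.

```python
from typing import Dict, List


def common_prefix_len(a: List[str], b: List[str]) -> int:
    """Number of leading words that two word lists share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def merge_endings(endings: List[str], min_common: int = 3) -> Dict[str, List[str]]:
    """Group overlapping endings; map each merged ending to its members."""
    groups: List[List[str]] = []
    for ending in endings:
        words = ending.split()
        for group in groups:
            first = group[0].split()
            if words == first or common_prefix_len(words, first) >= min_common:
                group.append(ending)
                break
        else:
            groups.append([ending])
    # The shortest member of each group represents the merged ending.
    return {min(g, key=lambda e: len(e.split())): g for g in groups}
```

Keeping the member list for each merged ending allows a later scoring step to consider every ending that contributed to the merge.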
Server 120 may optionally score the sentence endings (act 380). For example, server 120 may score the sentence endings by popularity. In other words, sentence endings that occur more often in the documents retrieved by the search may be scored higher than sentence endings that do not occur as often. Server 120 may alternatively, or additionally, score the sentence endings based on where the text fragment of the search query occurs within the identified sentences. In other words, the sentence endings corresponding to sentences where the text fragment of the search query occurs at the beginning of the sentences may be scored higher than sentence endings corresponding to sentences where the text fragment occurs elsewhere within the sentences. Server 120 may also penalize sentence endings for being too long, decreasing their scores. Server 120 may separately consider all of the sentence endings that were used to create a merged sentence ending when determining the score of that sentence ending.
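The scoring factors described above (popularity, position of the text fragment, and a length penalty) might be combined as in the sketch below. The additive form and the weights `start_bonus`, `max_words`, and `length_penalty` are assumptions for illustration; the description does not specify how the factors are weighted.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def score_endings(
    occurrences: Iterable[Tuple[str, bool]],
    start_bonus: float = 0.5,
    max_words: int = 8,
    length_penalty: float = 0.25,
) -> List[Tuple[str, float]]:
    """Score endings from (ending, fragment_at_start) occurrences.

    Returns (ending, score) pairs sorted from highest to lowest score.
    """
    scores = defaultdict(float)
    for ending, at_start in occurrences:
        # Popularity: one point per occurrence, plus a bonus when the
        # fragment began the sentence the ending came from.
        scores[ending] += 1.0 + (start_bonus if at_start else 0.0)
    for ending in scores:
        # Penalize endings that run past max_words.
        extra = max(0, len(ending.split()) - max_words)
        scores[ending] -= length_penalty * extra
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```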
Server 120 may present the sentence endings to the user (act 390). If the sentence endings were scored in some manner, server 120 may organize the sentence endings into a ranked list that it may provide to the user. In one implementation, server 120 may present an initial group of sentence endings to the user. The user may then be permitted to cycle through subsequent groups in a conventional manner.
FIG. 4 is a diagram of an exemplary ranked list 400 according to an implementation consistent with the principles of the invention. The exemplary ranked list 400 may include ranked items that each include a score 410 and a sentence ending (or “completion”) 420. In this example, the user has provided a partial sentence of “I need to go now because.” Server 120 provided various sentence endings that complete the partial sentence. In this example, the top-ranked sentence ending is “I have to get up early tomorrow.”
In another implementation consistent with the principles of the invention, server 120 may provide sentence endings via a different interface. For example, server 120 may operate in conjunction with an application, such as a word processing application, an instant messenger application, an e-mail application, or another type of application via which documents (including messages) are prepared or edited. In any case, a server assistant, which may be in the form of executable code, such as a plug-in, an applet, a dynamic link library (DLL), or a similar type of executable object or process, resident on client 110, may operate to obtain the sentence endings from server 120. For example, the server assistant may notice text fragments that may require completion and communicate with server 120 to obtain the sentence endings. The server assistant may “notice” the text fragments by detecting them automatically to obtain the sentence endings on-the-fly or by detecting them when instructed by the user.
According to one implementation, the server assistant may automatically insert one of the sentence endings at the location of the user's cursor. For example, if the user types “I need to go because” and presses a special key, the server assistant may complete the sentence by automatically inserting one of the sentence endings. The user may then be permitted to view other possible sentence endings by pressing the special key again. Alternatively, subsequent sentence endings may be automatically presented after expiration of a possibly user-configurable amount of time. According to another implementation, the server assistant may present the sentence endings via a pop-up window, another type of interface, or a combination of interfaces (e.g., a first possible sentence ending may be automatically inserted, but subsequent sentence endings may be presented via a pop-up window).
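The special-key behavior of the server assistant might be modeled as in the following sketch. The class name `CompletionCycler`, the `press_key` method, and the wrap-around to the first ending after the last one are assumptions; the description says only that other possible endings may be viewed by pressing the key again.

```python
from typing import List


class CompletionCycler:
    """Cycle through ranked sentence endings for one text fragment."""

    def __init__(self, fragment: str, endings: List[str]) -> None:
        if not endings:
            raise ValueError("at least one sentence ending is required")
        self.fragment = fragment
        self.endings = list(endings)
        self.index = -1  # nothing inserted yet

    def press_key(self) -> str:
        """Return the text with the next ending inserted after the fragment."""
        # Each press replaces the current ending with the next one (assumed
        # to wrap around after the last ending).
        self.index = (self.index + 1) % len(self.endings)
        return f"{self.fragment} {self.endings[self.index]}"
```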
CONCLUSION
Systems and methods consistent with the principles of the invention may automatically complete a fragment of text, such as a sentence or paragraph. The systems and methods may identify possible endings from text in web documents.
The foregoing description of preferred embodiments of the present invention provides illustration and description, but is not intended to be exhaustive or to limit the invention to the precise form disclosed. Modifications and variations are possible in light of the above teachings or may be acquired from practice of the invention. For example, while series of acts have been described with regard to FIGS. 3A and 3B, the order of the acts may be modified in other implementations consistent with the principles of the invention. Also, non-dependent acts may be performed in parallel. Further, while the acts of trimming and merging have been described as preceding the act of scoring, the scoring act may be performed prior to the trimming and/or merging acts.
Also, automatic paragraph completion has been described briefly. In one implementation, server 120 may provide a separate interface for paragraph completion. In another implementation, server 120 may provide the same interface for sentence and paragraph completion. When searching for paragraph endings, server 120 may also look for synonyms of the words provided in the search query. Server 120 may provide paragraph endings separately from or along with sentence endings. For example, server 120 may score the paragraph endings and the sentence endings and rank them based on their scores. It may be possible for server 120 to provide paragraph endings instead of sentence endings when server 120 finds no (or very few) good sentence endings for the search query.
Further, it has generally been described that server 120 performs most, if not all, of the acts described with regard to the processing of FIGS. 3A and 3B. In another implementation consistent with the principles of the invention, one or more, or all, of the acts may be performed by client 110. For example, client 110 may obtain a text fragment and search documents local to client 110 (e.g., documents stored by client 110 and/or documents stored by a database accessible by client 110) to identify one or more documents that contain the text fragment. From these documents, client 110 may then identify potential sentence completions for the text fragment.

Claims (30)

1. A method performed by one or more server or client devices, comprising:
obtaining, using a processor associated with the one or more server or client devices, a text fragment;
performing, using a processor associated with the one or more server or client devices, a search, based, at least in part, on the text fragment, to identify one or more documents;
identifying, using a processor associated with the one or more server or client devices, sentences within the one or more documents that include the text fragment;
determining, using a processor associated with the one or more server or client devices, sentence endings as text that is located within the identified sentences between the text fragment and an end of the identified sentences;
assigning, using a processor associated with the one or more server or client devices, scores to the sentence endings based, at least in part, on a location within the identified sentences at which the text fragment occurs; and
outputting, using a processor associated with the one or more server or client devices, the sentence endings as potential completions for the text fragment based, at least in part, on the scores.
2. The method of claim 1, where the text fragment includes a phrase.
3. The method of claim 1, where the obtaining a text fragment includes receiving the text fragment from a user.
4. The method of claim 1, where the obtaining a text fragment includes automatically detecting the text fragment.
5. The method of claim 1, where the performing a search includes searching for documents that include the text fragment as a phrase.
6. The method of claim 1, where the performing a search includes searching for documents that include the text fragment and synonyms of one or more words within the text fragment.
7. The method of claim 1, where the identifying sentences within the one or more documents includes determining boundaries of the identified sentences based, at least in part, on punctuation that borders the identified sentences in the one or more documents.
8. The method of claim 1, further comprising:
trimming at least one of the sentence endings by dropping one or more words from the at least one sentence ending.
9. The method of claim 8, where the one or more words are dropped from the at least one sentence ending based, at least in part, on at least one of text or one or more symbols included in the at least one sentence ending.
10. The method of claim 9, further comprising:
generating an inverse document frequency table that includes words common to sentence endings; and
where the trimming at least one of the sentence endings includes:
comparing the text of the at least one sentence ending to words in the inverse document frequency table, and
dropping one or more words from the at least one sentence ending based, at least in part, on a result of the comparison.
11. The method of claim 9, where the trimming at least one of the sentence endings includes:
identifying the one or more symbols included in the at least one sentence ending, and
dropping one or more words from the at least one sentence ending based, at least in part, on the one or more identified symbols.
12. The method of claim 1, further comprising:
merging two or more of the sentence endings into a merged sentence ending.
13. The method of claim 12, where the merging two or more of the sentence endings includes:
identifying two or more of the sentence endings that have text in common, and
merging the identified two or more sentence endings.
14. The method of claim 1, further comprising:
determining quality ones of the sentence endings based, at least in part, on at least one of a table of common beginnings of sentences or a table of common endings of sentences.
15. The method of claim 1, where assigning the scores to the sentence endings is further based, at least in part, on a measure of popularity associated with each of the sentence endings.
16. The method of claim 15, where the measure of popularity associated with the sentence endings is based, at least in part, on a number of times that the sentence endings occur within the one or more documents.
17. The method of claim 1, further comprising:
adjusting the scores of the sentence endings based, at least in part, on lengths of the sentence endings.
18. The method of claim 1, further comprising:
adjusting the scores of the sentence endings based, at least in part, on whether at least a portion of the sentence endings are included in a list of bad endings.
19. The method of claim 1, where the outputting the sentence endings includes:
ordering the sentence endings based, at least in part, on the scores, and
presenting the ordered sentence endings as potential completions for the text fragment.
20. The method of claim 1, where the outputting the sentence endings includes:
inserting one of the sentence endings near a location of the text fragment, and
replacing the one of the sentence endings with a subsequent one or more of the sentence endings.
21. A system, comprising:
one or more devices comprising:
means for receiving a text fragment;
means for identifying documents that include the text fragment;
means for locating sentences within the documents that include the text fragment;
means for identifying sentence endings, associated with the located sentences, as text that is located within the located sentences between the text fragment and an end of the located sentences;
means for assigning scores to the sentence endings based, at least in part, on a measure of popularity associated with the sentence endings and a location at which the text fragment occurs within the located sentences, where the measure of popularity associated with one of the sentence endings is based, at least in part, on a number of times that the one of the sentence endings occurs within the documents; and
means for presenting the sentence endings as potential completions for the text fragment based, at least in part, on the scores.
22. The system of claim 21, further comprising:
means for determining whether a quantity of the documents is less than a threshold;
means for shortening the text fragment when the quantity of the documents is less than the threshold; and
means for performing a search, based, at least in part, on the shortened text fragment, to identify a set of documents.
23. The system of claim 22, where the means for shortening the text fragment includes means for dropping one or more words from a beginning or end of the text fragment.
24. The system of claim 22, where the means for shortening the text fragment includes:
means for identifying one or more symbols within the text fragment, and
means for dropping one or more words from the text fragment based, at least in part, on the one or more identified symbols.
25. The system of claim 22, where the means for shortening the text fragment includes:
means for analyzing a structure of the text fragment, and
means for dropping one or more words from the text fragment based, at least in part, on the analysis of the structure of the text fragment.
26. A system, comprising:
one or more servers to:
receive a text fragment, where the text fragment includes a plurality of words,
identify documents that include the text fragment,
locate sentences within the documents that include the text fragment,
determine sentence completions, associated with the located sentences, as text that is located within the located sentences between the text fragment and an end of the located sentences,
trim one of the sentence completions by dropping one or more words from the one of the sentence completions,
assign scores to the sentence completions based, at least in part, on a measure of popularity associated with the sentence completions and a location within the located sentences at which the text fragment occurs, and
provide a plurality of the sentence completions including the trimmed sentence completion as potential completions for the text fragment based, at least in part, on the scores.
27. The system of claim 26, where the one or more servers are further to:
discard one or more of the sentence completions when at least a portion of the one or more sentence completions is included in a list of bad completions.
28. The system of claim 26, where the one or more servers include a plurality of servers.
29. A computer device, comprising:
a memory to store instructions; and
a processor to execute the instructions in the memory to:
obtain a fragment of text,
search for documents that include the fragment of text,
identify sentences within the documents that include the fragment of text,
determine sentence completions as text located within the identified sentences between the fragment of text and an end of the identified sentences,
merge at least two of the sentence completions to form a single merged sentence completion,
assign scores to the sentence completions based, at least in part, on a measure of popularity associated with the sentence completions and a location within the identified sentences at which the fragment of text occurs, and
provide a plurality of the sentence completions, including the merged sentence completion, as potential completions for the fragment of text based, at least in part, on the scores.
30. The device of claim 29, where, when providing the plurality of sentence completions, the processor is to:
present the plurality of sentence completions via a pop-up window.
US10/697,333 2003-10-31 2003-10-31 Automatic completion of fragments of text Active 2028-12-02 US7657423B1 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
US10/697,333 US7657423B1 (en) 2003-10-31 2003-10-31 Automatic completion of fragments of text
US12/636,926 US8024178B1 (en) 2003-10-31 2009-12-14 Automatic completion of fragments of text
US13/235,025 US8280722B1 (en) 2003-10-31 2011-09-16 Automatic completion of fragments of text
US13/598,089 US8521515B1 (en) 2003-10-31 2012-08-29 Automatic completion of fragments of text

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US10/697,333 US7657423B1 (en) 2003-10-31 2003-10-31 Automatic completion of fragments of text

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US12/636,926 Continuation US8024178B1 (en) 2003-10-31 2009-12-14 Automatic completion of fragments of text

Publications (1)

Publication Number Publication Date
US7657423B1 true US7657423B1 (en) 2010-02-02

Family

ID=41581388

Family Applications (4)

Application Number Title Priority Date Filing Date
US10/697,333 Active 2028-12-02 US7657423B1 (en) 2003-10-31 2003-10-31 Automatic completion of fragments of text
US12/636,926 Expired - Lifetime US8024178B1 (en) 2003-10-31 2009-12-14 Automatic completion of fragments of text
US13/235,025 Expired - Lifetime US8280722B1 (en) 2003-10-31 2011-09-16 Automatic completion of fragments of text
US13/598,089 Expired - Lifetime US8521515B1 (en) 2003-10-31 2012-08-29 Automatic completion of fragments of text

Country Status (1)

Country Link
US (4) US7657423B1 (en)


Family Cites Families (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6169538B1 (en) * 1998-08-13 2001-01-02 Motorola, Inc. Method and apparatus for implementing a graphical user interface keyboard and a text buffer on electronic devices
US6549897B1 (en) * 1998-10-09 2003-04-15 Microsoft Corporation Method and system for calculating phrase-document importance
US6480843B2 (en) * 1998-11-03 2002-11-12 Nec Usa, Inc. Supporting web-query expansion efficiently using multi-granularity indexing and query processing
US6646573B1 (en) * 1998-12-04 2003-11-11 America Online, Inc. Reduced keyboard text input system for the Japanese language
US6204848B1 (en) * 1999-04-14 2001-03-20 Motorola, Inc. Data entry apparatus having a limited number of character keys and method
US20020087408A1 (en) * 1999-06-25 2002-07-04 Burnett Jonathan Robert System for providing information to intending consumers
US6529864B1 (en) * 1999-08-11 2003-03-04 Roedy-Black Publishing, Inc. Interactive connotative dictionary system
JP3918374B2 (en) * 1999-09-10 2007-05-23 富士ゼロックス株式会社 Document retrieval apparatus and method
US6587848B1 (en) * 2000-03-08 2003-07-01 International Business Machines Corporation Methods and apparatus for performing an affinity based similarity search
US6859800B1 (en) * 2000-04-26 2005-02-22 Global Information Research And Technologies Llc System for fulfilling an information need
US7017114B2 (en) * 2000-09-20 2006-03-21 International Business Machines Corporation Automatic correlation method for generating summaries for text documents
US6782384B2 (en) * 2000-10-04 2004-08-24 Idiom Merger Sub, Inc. Method of and system for splitting and/or merging content to facilitate content processing
JP2003030224A (en) * 2001-07-17 2003-01-31 Fujitsu Ltd Device for preparing document cluster, system for retrieving document and system for preparing faq
US20050022114A1 (en) * 2001-08-13 2005-01-27 Xerox Corporation Meta-document management system with personality identifiers
US7340534B2 (en) * 2002-03-05 2008-03-04 Sun Microsystems, Inc. Synchronization of documents between a server and small devices
US7478170B2 (en) * 2002-03-05 2009-01-13 Sun Microsystems, Inc. Generic infrastructure for converting documents between formats with merge capabilities
US7370035B2 (en) * 2002-09-03 2008-05-06 Idealab Methods and systems for search indexing
US7155427B1 (en) * 2002-10-30 2006-12-26 Oracle International Corporation Configurable search tool for finding and scoring non-exact matches in a relational database
JP2004164036A (en) * 2002-11-08 2004-06-10 Hewlett Packard Co <Hp> Method for evaluating commonality of document
US7941762B1 (en) * 2003-02-14 2011-05-10 Shoretel, Inc. Display of real time information for selected possibilities
EP1627335A1 (en) * 2003-03-07 2006-02-22 Nokia Corporation A method and a device for frequency counting
US7129932B1 (en) * 2003-03-26 2006-10-31 At&T Corp. Keyboard for interacting on small devices
US7657423B1 (en) 2003-10-31 2010-02-02 Google Inc. Automatic completion of fragments of text

Non-Patent Citations (3)

Title
"Emacs Text Editor"; http://www.mrs.umn.edu/cs/unix/emacs.html; Apr. 5, 2002; pp. 1-6.
"Googlism"; http://www.googlism.com/about.htm; Oct. 9, 2003 (print date); 1 page.
"Microsoft Word AutoComplete & Auto Text Features"; http://computing.fandm.edu/training/wordx/autotext.php; Oct. 9, 2003; pp. 1-5.

Cited By (36)

Publication number Priority date Publication date Assignee Title
US20090006543A1 (en) * 2001-08-20 2009-01-01 Masterobjects System and method for asynchronous retrieval of information based on incremental user input
US8280722B1 (en) 2003-10-31 2012-10-02 Google Inc. Automatic completion of fragments of text
US8521515B1 (en) 2003-10-31 2013-08-27 Google Inc. Automatic completion of fragments of text
US20060217953A1 (en) * 2005-01-21 2006-09-28 Prashant Parikh Automatic dynamic contextual data entry completion system
US20100070855A1 (en) * 2005-01-21 2010-03-18 Prashant Parikh Automatic dynamic contextual data entry completion system
US7991784B2 (en) * 2005-01-21 2011-08-02 Prashant Parikh Automatic dynamic contextual data entry completion system
US8311805B2 (en) * 2005-01-21 2012-11-13 Prashant Parikh Automatic dynamic contextual data entry completion system
US8209323B2 (en) * 2006-07-18 2012-06-26 Cisco Technology, Inc. Methods and apparatuses for dynamically searching for electronic mail messages
US20090249203A1 (en) * 2006-07-20 2009-10-01 Akira Tsuruta User interface device, computer program, and its recording medium
US7912700B2 (en) * 2007-02-08 2011-03-22 Microsoft Corporation Context based word prediction
US7809719B2 (en) 2007-02-08 2010-10-05 Microsoft Corporation Predicting textual candidates
US20080195388A1 (en) * 2007-02-08 2008-08-14 Microsoft Corporation Context based word prediction
US20080195571A1 (en) * 2007-02-08 2008-08-14 Microsoft Corporation Predicting textual candidates
US8423526B2 (en) * 2008-01-02 2013-04-16 Thinkvillage-Oip, Llc Linguistic assistance systems and methods
US20120130707A1 (en) * 2008-01-02 2012-05-24 Jan Zygmunt Linguistic Assistance Systems And Methods
US20100199176A1 (en) * 2009-02-02 2010-08-05 Chronqvist Fredrik A Electronic device with text prediction function and method
US20110083079A1 (en) * 2009-10-02 2011-04-07 International Business Machines Corporation Apparatus, system, and method for improved type-ahead functionality in a type-ahead field based on activity of a user within a user interface
US20110126092A1 (en) * 2009-11-21 2011-05-26 Harris Technology, Llc Smart Paste
US20120072404A1 (en) * 2010-09-20 2012-03-22 Microsoft Corporation Dictionary service
US9213704B2 (en) * 2010-09-20 2015-12-15 Microsoft Technology Licensing, Llc Dictionary service
US8688698B1 (en) * 2011-02-11 2014-04-01 Google Inc. Automatic text suggestion
US9218338B1 (en) 2012-04-13 2015-12-22 Google Inc. Text suggestion
US9043198B1 (en) 2012-04-13 2015-05-26 Google Inc. Text suggestion
US20140019117A1 (en) * 2012-07-12 2014-01-16 Yahoo! Inc. Response completion in social media
US9380009B2 (en) * 2012-07-12 2016-06-28 Yahoo! Inc. Response completion in social media
US8930181B2 (en) 2012-12-06 2015-01-06 Prashant Parikh Automatic dynamic contextual data entry completion
WO2015071804A1 (en) * 2013-11-13 2015-05-21 International Business Machines Corporation Ranking prediction candidates of controlled natural languages or business rules depending on document hierarchy
GB2544149A (en) * 2015-08-19 2017-05-10 Hand Held Prod Inc Auto-complete methods for spoken complete value entries
US10410629B2 (en) 2015-08-19 2019-09-10 Hand Held Products, Inc. Auto-complete methods for spoken complete value entries
GB2573631A (en) * 2015-08-19 2019-11-13 Hand Held Prod Inc Auto-complete methods for spoken complete value entries
US10529335B2 (en) 2015-08-19 2020-01-07 Hand Held Products, Inc. Auto-complete methods for spoken complete value entries
GB2573631B (en) * 2015-08-19 2020-04-08 Hand Held Prod Inc Auto-complete methods for spoken complete value entries
EP3495928A4 (en) * 2016-08-03 2019-07-24 Tencent Technology (Shenzhen) Company Limited Candidate input determination method, input suggestion method, and electronic apparatus
US11050685B2 (en) 2016-08-03 2021-06-29 Tencent Technology (Shenzhen) Company Limited Method for determining candidate input, input prompting method and electronic device
US20180101599A1 (en) * 2016-10-08 2018-04-12 Microsoft Technology Licensing, Llc Interactive context-based text completions
US11704493B2 (en) 2020-01-15 2023-07-18 Kyndryl, Inc. Neural parser for snippets of dynamic virtual assistant conversation

Also Published As

Publication number Publication date
US8280722B1 (en) 2012-10-02
US8024178B1 (en) 2011-09-20
US8521515B1 (en) 2013-08-27

Similar Documents

Publication Publication Date Title
US8521515B1 (en) Automatic completion of fragments of text
US8805867B2 (en) Query rewriting with entity detection
US10528650B2 (en) User interface for presentation of a document
US6678694B1 (en) Indexed, extensible, interactive document retrieval system
US8527491B2 (en) Expanded text excerpts
US8554759B1 (en) Selection of documents to place in search index
US7260571B2 (en) Disambiguation of term occurrences
US8515952B2 (en) Systems and methods for determining document freshness
US8316007B2 (en) Automatically finding acronyms and synonyms in a corpus
US20060259475A1 (en) Database system and method for retrieving records from a record library
US20150172299A1 (en) Indexing and retrieval of blogs
US9031898B2 (en) Presentation of search results based on document structure
US9569504B1 (en) Deriving and using document and site quality signals from search query streams
US7451120B1 (en) Detecting novel document content
US8122022B1 (en) Abbreviation detection for common synonym generation
US20060122997A1 (en) System and method for text searching using weighted keywords
US8694499B1 (en) Systems and methods for determining query similarity by query distribution comparison
US20040098385A1 (en) Method for indentifying term importance to sample text using reference text
JP2015525929A (en) Weight-based stemming to improve search quality
US9183297B1 (en) Method and apparatus for generating lexical synonyms for query terms
WO2012143839A1 (en) A computerized system and a method for processing and building search strings
Saggion et al. Exploring the performance of boolean retrieval strategies for open domain question answering
US7970752B2 (en) Data processing system and method
JP2004206608A (en) Document retrieval method, its device, and its program
JP2013015967A (en) Retrieval system, index preparation apparatus, retrieval device, index preparation method, retrieval method, and program

Legal Events

Date Code Title Description
AS Assignment
    Owner name: GOOGLE INC., CALIFORNIA
    Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:HARIK, GEORGES R.;TONG, SIMON;CHENG, DAVID R.;SIGNING DATES FROM 20031029 TO 20031030;REEL/FRAME:015813/0501
STCF Information on status: patent grant
    Free format text: PATENTED CASE
CC Certificate of correction
FPAY Fee payment
    Year of fee payment: 4
FPAY Fee payment
    Year of fee payment: 8
AS Assignment
    Owner name: GOOGLE LLC, CALIFORNIA
    Free format text: CHANGE OF NAME;ASSIGNOR:GOOGLE INC.;REEL/FRAME:044101/0610
    Effective date: 20170929
MAFP Maintenance fee payment
    Free format text: PAYMENT OF MAINTENANCE FEE, 12TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1553); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY
    Year of fee payment: 12