The first section (API details) describes all the information a developer needs to integrate with the API. The next section (Common Use Cases) describes various ways in which the API can be put to use. It is recommended to read the Common Use Cases section before beginning implementation. Certain readers, such as product managers, may find it more useful to begin with that section.
The two appendices describe in detail two types of information returned by the API: correction types and correction categories, respectively.
Sandbox: an endpoint for development and integration purposes. https://sb-partner-services.gingersoftware.com/correction/v1/document
Production: https://doc-partner-services.gingersoftware.com/correction/v1/document
Request: To correct a document, create a POST request with the document text in the body of the request. The document text is assumed to be url-encoded.
Sample request to the production endpoint:
https://doc-partner-services.gingersoftware.com/correction/v1/document?apiKey=someApiKey
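The Content-Type header of the request must be one of the following:
- text/plain for plain text documents
- text/html for html documents
Other values are rejected, for example:
- application/plain: only type “text” is supported
- text/pdf: only subtypes “plain” and “html” are supported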
URL params can be passed in any order.
In case an illegal param value is passed, the server returns HTTP error 400 with an appropriate error message.
An unknown param name in the query string will be silently ignored.
Following is the list of supported url params and their values.
apiKey: Mandatory. The API key received from Ginger.
If you do not have an API key yet, request one from Ginger.
lang: Not mandatory. Defaults to “Indifferent”.
The English language locale to be used for correction.
“UK” or “US” enforces British English or American English spelling respectively. “Indifferent” makes the correction agnostic to locale, so either variation (e.g. colour or color) is considered correct.
generateSynonyms: Not mandatory. Defaults to false.
If set to true, returns contextual synonym suggestions in addition to corrections.
For more details about synonyms, see the appendix.
generateRecommendations: Not mandatory. Defaults to false.
If set to true, returns style and vocabulary recommendations in addition to other corrections.
For more details about recommendations, see the appendix.
avoidCapitalization: Not mandatory. Defaults to false.
If set to true, beginning-of-sentence capitalization will not be checked.
For example, when the flag is set to true the sentence “the boy is tall” will be returned with no corrections (as opposed to suggesting to capitalize the word “the”).
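The request described above can be sketched in Python as follows. This is a minimal illustration, not an official client: the function name and its keyword arguments are hypothetical, the parameter names are taken from the examples and error messages in this document, and percent-encoding of the body is an assumption about what “url-encoded” means here. The request is only built, not sent.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

PRODUCTION_URL = "https://doc-partner-services.gingersoftware.com/correction/v1/document"


def build_correction_request(text, api_key, lang="Indifferent",
                             generate_synonyms=False,
                             generate_recommendations=False,
                             avoid_capitalization=False,
                             base_url=PRODUCTION_URL):
    """Build (but do not send) a POST request for a plain-text document."""
    params = {
        "apiKey": api_key,
        "lang": lang,
        # Booleans are written as lowercase "true"/"false", as in the
        # generateSynonyms=true example in this document.
        "generateSynonyms": str(generate_synonyms).lower(),
        "generateRecommendations": str(generate_recommendations).lower(),
        "avoidCapitalization": str(avoid_capitalization).lower(),
    }
    url = base_url + "?" + urlencode(params)
    # Assumption: "url-encoded" means percent-encoding of the document text.
    body = quote(text).encode("utf-8")
    return Request(url, data=body, method="POST",
                   headers={"Content-Type": "text/plain"})


request = build_correction_request("She not hear anything.", "someApiKey", lang="US")
print(request.get_method(), request.full_url)
```

The resulting object can be passed to urllib.request.urlopen to actually send it. Pointing base_url at the sandbox endpoint is advisable during development.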
Below is the response for sending the following two-sentence document for correction:
Her closes the door quiet. She not hear anything.
The response contains an array of three corrections with all their details as described below.
The response consists of two arrays: Corrections and Sentences.
Corrections array: Each element represents an error found in the document and its suggested corrections.
Confidence: A discrete value indicating how likely the correction is to be precise. Possible values: 4 (High), 3 (Medium), 2 (Low), 1 (None). The higher the value, the more reliable the correction.
ShouldReplace: A Boolean indicating whether it is recommended to automatically replace the error word with the top suggestion. For a review of various client implementation strategies, see the “Common use cases” section in this doc.
CorrectionType: A classification of the correction into one of the following high-level categories: 1 – Spelling, 2 – Misused Word, 3 – Grammar, 4 – Synonym, 5 – Recommendation, 6 – Punctuation
TopCategoryId: The id of the grammatical category the top suggestion belongs to. For a detailed description and a full list of category ids see the appendix to this document.
MistakeText: The word or words which contain an error.
MistakeDefinition: Whenever available, a dictionary definition of the mistake word.
From, To: Zero based indices of the detected error, relative to the beginning of the document.
LrnFrg: Whenever available, the grammatical context of the error word/s. A short, not necessarily consecutive, fragment from the original sentence which gives grammatically intact context around the error word/s.
MistakeWordsInLrnFrg: The indices of the words with errors within the LrnFrg. The indices are zero based, relative to the beginning of the LrnFrg.
Suggestions array: An array of suggested corrections for the error. The suggestions are ordered by relevance. Each suggestion contains the following data:
- Text: The text with which to replace the error word/s in the original document.
- CategoryId: The id of the grammatical category which the suggestion belongs to. For a detailed description and a full list of category ids see the appendix to this document.
- Definition: Whenever available, a dictionary definition of the suggested word.
Similar to the Corrections array. Contains vocabulary and style recommendations, as described in the Recommendations section of the “Contextual Synonyms and Recommendations” appendix. This array will appear only if there are recommendation corrections returned by the API.
Sentences array: An array in which each element represents a sentence in the original document. A client can use this section to determine how Ginger’s correction engine split the document into sentences, determine which sentences were not corrected and why, compute statistics like the number of sentences in the document and so on.
Each sentence in the array contains the following information:
- FromIndex, ToIndex: Zero based indices of the sentence, relative to the beginning of the document.
- IsEnglish: False means the sentence was flagged as not being in English and thus was not corrected.
- ExceededCharactersLimit: Sentences which exceed 300 characters are not corrected. Such sentences will have “true” in this field.
Response Example:
{
  "GingerTheDocumentResult": {
    "Corrections": [
      {
        "CorrectionType": 3,
        "From": 0,
        "Suggestions": [ { "CategoryId": 5, "Text": "She" } ],
        "To": 2,
        "TopCategoryId": 5
      },
      {
        "CorrectionType": 3,
        "From": 20,
        "Suggestions": [ { "CategoryId": 23, "Text": "quietly" } ],
        "To": 24,
        "TopCategoryId": 23
      },
      {
        "CorrectionType": 3,
        "From": 31,
        "Suggestions": [ { "CategoryId": 21, "Text": "doesn't hear" } ],
        "To": 38,
        "TopCategoryId": 21
      }
    ],
    "Sentences": [
      { "ExceededCharacterLimit": false, "FromIndex": 0, "IsEnglish": true, "ToIndex": 26 },
      { "ExceededCharacterLimit": false, "FromIndex": 27, "IsEnglish": true, "ToIndex": 49 }
    ]
  }
}
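As an illustration of how the sample response above maps back onto the original text, the following sketch pairs each detected mistake with its suggestions. The function name is hypothetical. The sketch treats From and To as inclusive indices, which is inferred from the sample (“Her” spans 0 to 2) rather than stated explicitly.

```python
import json

document = "Her closes the door quiet. She not hear anything."

# Abridged copy of the sample response (the Sentences array is omitted).
response_body = """
{"GingerTheDocumentResult": {"Corrections": [
  {"CorrectionType": 3, "From": 0, "To": 2, "TopCategoryId": 5,
   "Suggestions": [{"CategoryId": 5, "Text": "She"}]},
  {"CorrectionType": 3, "From": 20, "To": 24, "TopCategoryId": 23,
   "Suggestions": [{"CategoryId": 23, "Text": "quietly"}]},
  {"CorrectionType": 3, "From": 31, "To": 38, "TopCategoryId": 21,
   "Suggestions": [{"CategoryId": 21, "Text": "doesn't hear"}]}
]}}
"""


def list_corrections(document, response_body):
    """Return (mistake text, list of suggested replacements) pairs."""
    result = json.loads(response_body)["GingerTheDocumentResult"]
    pairs = []
    for correction in result["Corrections"]:
        # From and To are zero-based and treated as inclusive here.
        mistake = document[correction["From"]:correction["To"] + 1]
        suggestions = [s["Text"] for s in correction["Suggestions"]]
        pairs.append((mistake, suggestions))
    return pairs


for mistake, suggestions in list_corrections(document, response_body):
    print(mistake, "->", suggestions)
```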
Returns a 200 response on success.
A mandatory param was not passed. The Message property provides information about the missing parameter. In the example below it is the mandatory apiKey parameter.
{ "ErrorType": "Missing parameter", "Message": "APIKey is mandatory" }
{ "ErrorType": "Authentication error", "Message": "The api key is not recognized" }
{ "ErrorType": "Authentication error", "Message": "The api key is expired or disabled" }
A url param was passed with an illegal value. The Message property provides detailed information. In the example below it is the lang parameter, which was passed with an illegal value of “USS” (instead of “US”). Other examples of illegal values: generateSynonyms=falsee, avoidCapitalization=tru.
{ "ErrorType": "Illegal parameter value", "Message": "lang param value is invalid: uss" }
The document sent for correction exceeded the maximum allowed size. The maximum document size is normally between 10,000 and 50,000 characters.
{ "ErrorType": "Server-side error", "Message": "The document is too long" }
An unsupported document type was passed in the Content-Type header.
{ "ErrorType": "Unsupported media-type in Content-Type header", "Message": "text/plaind is unsupported" }
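A client may want to turn the error bodies shown above into exceptions. The sketch below is one possible approach. The exception class and function names are hypothetical, and it assumes only that non-200 responses carry a JSON body with ErrorType and Message properties, as in the examples.

```python
import json


class GingerApiError(Exception):
    """Hypothetical exception wrapping an error response from the API."""

    def __init__(self, status, error_type, message):
        super().__init__(f"HTTP {status}: {error_type}: {message}")
        self.status = status
        self.error_type = error_type
        self.message = message


def raise_for_error(status, body):
    """Raise GingerApiError unless the HTTP status is 200."""
    if status == 200:
        return
    try:
        payload = json.loads(body)
    except ValueError:
        payload = None
    if not isinstance(payload, dict):
        # The body was not the expected JSON object; keep the raw text.
        payload = {"Message": body}
    raise GingerApiError(status,
                         payload.get("ErrorType", "Unknown error"),
                         payload.get("Message", ""))


try:
    raise_for_error(400, '{ "ErrorType": "Illegal parameter value", '
                         '"Message": "lang param value is invalid: uss" }')
except GingerApiError as error:
    print(error.error_type, "|", error.message)
```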
Ginger’s correction API returns suggested corrections for words suspected as errors. It is up to the client of the API to decide how to apply the corrections to the text. The following section outlines common use cases. These can be broadly categorized as interactive and offline scenarios. Ginger’s API contains all the data required to serve both of these scenarios.
There is some overlap between the two, so it is recommended to read both sections below.
Interactive Corrections
In this scenario, a user submits their text for grammar and spelling correction through an online text editor. The following are some common practices for displaying the results of the API.
Highlighting errors
Highlighting or otherwise marking the parts of the text which were identified as errors can help the user focus on correcting them. This can be done using the “From” and “To” fields of each “Correction” object from the “Corrections” array in the response.
Displaying alternative suggestions
When an error is identified, three options exist with regard to how to correct it: there is either a single suggested correction, multiple suggested corrections or no suggested corrections.
Single suggested correction: In terms of the API response, this is the case when the “Suggestions” array of a certain “Correction” object contains a single element. The client application should decide how to display this suggestion to the user. This can be done either by replacing the error word automatically or by only highlighting it and letting the user view the suggestion and decide whether to apply it.
Implementations which choose to replace automatically should consider the “ShouldReplace” property of the Correction object. If it is false, it is advised not to perform an automatic replacement.
Multiple suggested corrections: this case is similar to the previous one except that since there are several suggested replacements the user should be able to choose the one they would like to replace the error with. The items in the “Suggestions” array in the response are ordered according to their likelihood, so it is recommended that they are displayed to the user in the same order they are received in the response from the API.
Implementations which choose to replace automatically should thus do so using the first item in the Suggestions array and also consider the “ShouldReplace” property of the Correction object as mentioned above.
No suggested corrections: this is the case when Ginger detected that there is a mistake in the text, but is not able to suggest any reasonable option for correction. Here is a simple example of such a case: “My hshfjedkskhd is new.” In such cases it is still helpful to highlight the mistake text, but it is advisable to make it clear to the user that no suggestions are available, either by using a different highlight color, by explicitly stating so in the UI, or both.
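The three cases above can be combined into a simple automatic-replacement routine, sketched below under stated assumptions. The function name is hypothetical. Corrections are assumed not to overlap, From and To are treated as inclusive, and a missing ShouldReplace field is treated as true only because the sample response in this document omits it; a more conservative client may prefer the opposite default.

```python
def apply_top_suggestions(document, corrections):
    """Replace each mistake with its first suggestion where advisable."""
    fixed = document
    # Work from the end of the document backwards so that the indices of
    # corrections not yet applied remain valid.
    for correction in sorted(corrections, key=lambda c: c["From"], reverse=True):
        suggestions = correction.get("Suggestions") or []
        if not suggestions:
            continue  # "No suggested corrections" case: leave text as is.
        if not correction.get("ShouldReplace", True):
            continue  # The API advises against automatic replacement.
        start, end = correction["From"], correction["To"] + 1
        fixed = fixed[:start] + suggestions[0]["Text"] + fixed[end:]
    return fixed


document = "Her closes the door quiet. She not hear anything."
corrections = [
    {"From": 0, "To": 2, "Suggestions": [{"Text": "She"}]},
    {"From": 20, "To": 24, "Suggestions": [{"Text": "quietly"}]},
    {"From": 31, "To": 38, "Suggestions": [{"Text": "doesn't hear"}]},
]
print(apply_top_suggestions(document, corrections))
```

Corrections that are skipped here are good candidates for highlighting, so that the user can make the decision.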
Displaying additional information about the correction
Implementations may choose to reflect several other parts of the API response to users:
Confidence: A user will likely benefit from knowing a certain correction is considered low confidence and thus be more careful when deciding whether to select one of the suggested alternatives. The Confidence field in the response contains this information. Values of 4 and 3 (High and Medium) can be considered very reliable. A value of 1 indicates low confidence. It is recommended to differentiate between the two in the user interface, e.g. by using a different highlight color in each case.
Correction Types: Corrections can be thought of as belonging to one of several groups or types: spelling mistakes, grammar mistakes, punctuation issues, word usage or vocabulary mistakes, synonyms and style recommendations. This information is contained in the CorrectionType field for each correction. An implementation can choose to use this field in various ways, such as:
- Help a user distinguish between different types of mistakes by using different colors for spelling, grammar and vocabulary errors
- Distinguish between mistakes and style recommendations and synonym suggestions
- Allow users to toggle the display of the different types of corrections on or off
Grammatical category: Each suggestion for correction is classified by the API as belonging to a certain grammatical category or topic. Knowing the category makes it possible to display to the user more fine-grained information about why the correction is being suggested. This both facilitates learning and increases the user’s confidence in the system.
For example, in the sentence “She live in my neighborhood”, the verb “live” is corrected to “lives” so that it properly matches the subject of the sentence. Implementations can choose to explain to users at this point what subject-verb agreement is, or the reason this suggested correction is offered to them. This can be done, for example, by presenting the explanation alongside the suggested correction, or by using a tooltip when hovering over the error word.
This information is contained in the “CategoryId” field of each suggestion. To make things more convenient, the category of the first suggestion, which is the most likely one, is also given in the TopCategoryId property of the Correction object itself.
For a full list of categories, their description and examples, see the “Correction Categories” appendix in this document.
Definitions: Each suggestion often also has a dictionary definition of it in the Definition field of the Suggestion object. This info can be displayed to the user alongside the suggestion to enhance their understanding of the suggested word and thus make them more confident about their selection. Note that sometimes the definition will not appear, so the implementation must be ready to not display anything in this case.
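The additional fields discussed above can be gathered into a small display record per correction, as in the sketch below. The function name and the keys of the returned dictionary are hypothetical. Treating Confidence values of 2 and below as low confidence is an assumption based on the value labels given earlier in this document.

```python
# Names of the CorrectionType values, as listed in the response description.
CORRECTION_TYPES = {
    1: "Spelling",
    2: "Misused Word",
    3: "Grammar",
    4: "Synonym",
    5: "Recommendation",
    6: "Punctuation",
}


def describe(correction):
    """Collect the display-related fields of a Correction object."""
    suggestions = correction.get("Suggestions") or []
    top = suggestions[0] if suggestions else {}
    confidence = correction.get("Confidence")
    return {
        "type": CORRECTION_TYPES.get(correction.get("CorrectionType"), "Unknown"),
        # Assumption: 4 and 3 are reliable, 2 and 1 are shown as low confidence.
        "low_confidence": confidence is not None and confidence <= 2,
        "category_id": correction.get("TopCategoryId"),
        # Definitions are not always present, so this may be None.
        "definition": top.get("Definition"),
    }


print(describe({"CorrectionType": 3, "Confidence": 1, "TopCategoryId": 5,
                "Suggestions": [{"CategoryId": 5, "Text": "She"}]}))
```

A user interface could map the "type" value to a highlight color or a toggle, and show the definition only when it is not None.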
Sentence boundaries
Ginger’s API response contains information about where each sentence in the document begins and ends. This information is contained in the FromIndex and ToIndex properties of each Sentence object in the Sentences section. This can be useful for user interfaces which wish to mark sentences which contain errors, for example, or similar needs. To know which sentence contains a specific error, an implementation needs to locate the sentence for which the From and To indices of the correction are contained within the FromIndex and ToIndex of the Sentence.
The document is split into sentences according to standard end-of-sentence punctuation: period, question mark, exclamation mark and combinations of several such punctuation marks. Common abbreviations ending with a period (such as Dr., Mrs. and the like) are accounted for. Text that does not contain end-of-sentence punctuation will be considered a single sentence, which may result in inferior correction quality or in exceeding the maximum allowed size of a single sentence (300 characters).
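The containment check described above, which locates the sentence a correction belongs to, can be sketched as follows. The function name is hypothetical, and the data is taken from the sample response earlier in this document.

```python
def sentence_index_for(correction, sentences):
    """Return the position of the sentence containing the correction."""
    for index, sentence in enumerate(sentences):
        if (sentence["FromIndex"] <= correction["From"]
                and correction["To"] <= sentence["ToIndex"]):
            return index
    return None  # No sentence fully contains the correction.


sentences = [
    {"FromIndex": 0, "ToIndex": 26},
    {"FromIndex": 27, "ToIndex": 49},
]
print(sentence_index_for({"From": 31, "To": 38}, sentences))
```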
Unproofed sentences
The API response contains information which allows an application to determine which sentences were not proofread and why. This can happen for one of several reasons, each designated by a specific property in the relevant Sentence object of the response.
- IsEnglish: false indicates the sentence is not in English and was thus not checked
- ExceededCharactersLimit: true means the sentence was not checked because it is longer than the maximum limit of 300 characters. Implementations may wish to flag these sentences in a specific way, prompt the user to split them and recheck and so on.
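A client could collect the unproofed sentences and the reason for each along the following lines. The function name and reason strings are hypothetical. Because this document spells the length flag both as ExceededCharactersLimit (in the field description) and ExceededCharacterLimit (in the sample response), the sketch accepts either spelling.

```python
def unproofed_sentences(sentences):
    """Return (FromIndex, ToIndex, reason) for each sentence not checked."""
    skipped = []
    for sentence in sentences:
        # Accept both spellings of the flag that appear in this document.
        too_long = (sentence.get("ExceededCharactersLimit")
                    or sentence.get("ExceededCharacterLimit"))
        if sentence.get("IsEnglish") is False:
            reason = "not English"
        elif too_long:
            reason = "exceeds the 300-character limit"
        else:
            continue
        skipped.append((sentence["FromIndex"], sentence["ToIndex"], reason))
    return skipped


sentences = [
    {"FromIndex": 0, "ToIndex": 26, "IsEnglish": True, "ExceededCharacterLimit": False},
    {"FromIndex": 27, "ToIndex": 49, "IsEnglish": False, "ExceededCharacterLimit": False},
    {"FromIndex": 50, "ToIndex": 400, "IsEnglish": True, "ExceededCharactersLimit": True},
]
print(unproofed_sentences(sentences))
```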
Offline Corrections
There are use cases for the correction API which do not involve an interactive user interface. One common case is when a large number of documents are sent for correction as part of a business flow. Some examples:
- A legal firm may wish to send, at the end of each day, all the documents produced during that day for spelling and grammar proofreading. Each morning, contracts flagged with more than a certain number of errors, or containing certain error types, are passed for manual review.
- As part of reviewing a newly submitted manuscript, a book publisher or a scientific journal may wish to proofread it automatically and, based on the results, send it directly to an editor, pass it to a human proofreader or reject it altogether.
- A website may wish to perform a periodic review of all newly published content, user reviews and so on.
Many other such use cases exist. What all these cases have in common is that the response from the API is processed automatically by a machine, whose output may be data or decisions that serve the next point in the pipeline, a human readable report, a new document with proofreading applied to it and so on.
Interactive and offline implementations alike may find it useful to create summary reports about a single document or a set of documents. These may include both document information and proofreading information.
Document level information can include data such as number of sentences in each document, average sentence length, total document length, percent of non-English sentences and percent of too long sentences.
Proofreading information can show statistics about types of errors (Spelling vs. Grammar vs. Vocabulary), a breakdown of the user’s mistakes by grammatical topic, the overall number of mistakes, mistakes per sentence and the like.
Implementations may even want to consider tracking such statistics over time for profiling or progress monitoring purposes.
All of the information to generate the above statistics is contained in the various fields of the API response discussed in previous sections.
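As a rough sketch of such a report, the function below derives a few of the statistics mentioned above from a single response. The function name and the keys of the returned dictionary are hypothetical. The input is the object found under GingerTheDocumentResult.

```python
from collections import Counter

CORRECTION_TYPES = {1: "Spelling", 2: "Misused Word", 3: "Grammar",
                    4: "Synonym", 5: "Recommendation", 6: "Punctuation"}


def summarize(result):
    """Compute simple document and proofreading statistics."""
    corrections = result.get("Corrections", [])
    sentences = result.get("Sentences", [])
    count = len(sentences)
    non_english = sum(1 for s in sentences if s.get("IsEnglish") is False)
    by_type = Counter(CORRECTION_TYPES.get(c.get("CorrectionType"), "Unknown")
                      for c in corrections)
    return {
        "sentences": count,
        "corrections": len(corrections),
        "corrections_per_sentence": len(corrections) / count if count else 0.0,
        "percent_non_english": 100.0 * non_english / count if count else 0.0,
        "by_type": dict(by_type),
    }


sample = {
    "Corrections": [{"CorrectionType": 3}, {"CorrectionType": 3}, {"CorrectionType": 3}],
    "Sentences": [{"IsEnglish": True}, {"IsEnglish": True}],
}
print(summarize(sample))
```

Storing these dictionaries per document and per date would be one way to track progress over time.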
Example:
My coussin is very picky about the restaurants he dines in.
Without any flags, only the spelling mistake coussin->cousin will be corrected.
With generateSynonyms=true, in addition to the spelling correction, synonyms for “very” (really) and for “picky” (particular, fussy, finicky) will also be returned.
Recommendations are style and vocabulary suggestions.
There are three types of recommendations currently supported:
- Spelling out numbers: 0-9 replaced with the spelled-out number (when in the context of counting something)
- Overused Adjectives: alternative adjectives suggested whenever commonplace/banal ones are being used
- Flagging of passive voice usage. The active form is not suggested, only the fact that passive voice was used is flagged.
Examples:
- There are 2 very good reason for this.
Without any flags, only the mistake reason->reasons will be corrected.
With generateRecommendations=true, in addition to the grammar correction, the “2” will also be spelled out as “two”.
- I live in a big house.
Without any flags, there will be no correction.
With generateRecommendations=true, there will be two suggestions for the adjective “big” (large, spacious).
- The action was taken by her
Without any flags, there will be no correction.
With generateRecommendations=true, the use of the passive voice will be flagged.
General
A correction category specifies the grammatical topic a suggestion returned by the API belongs to. A category is identified by a unique numeric id which is returned in the LrnCatId field, which is part of each member of the Suggestions array in the response:
{ "Suggestions": [ { "LrnCatId": 43, "Text": "boy's" }, { "LrnCatId": 43, "Text": "boys'" } ] }
Each suggestion has a category associated with it. Thus, if there are multiple suggested corrections for a specific word, each will have its own, potentially different, category. For example, in the sentence “Book is on the shelf” there are two suggested corrections for “Book”: “The book” and “A book”. The first gets LrnCatId = 13 (DefiniteArticle), the second gets LrnCatId = 12 (IndefiniteArticle).
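Since each suggestion carries its own category id, a client may want to group suggestions by category before presenting explanations. The sketch below shows one way to do this. The function names are hypothetical. The category field is named LrnCatId in this appendix and CategoryId in the response description, so the sketch reads whichever is present.

```python
def category_id(suggestion):
    """Read the category id under either field name used in this document."""
    if "CategoryId" in suggestion:
        return suggestion["CategoryId"]
    return suggestion.get("LrnCatId")


def suggestions_by_category(suggestions):
    """Group suggestion texts by their category id, preserving order."""
    grouped = {}
    for suggestion in suggestions:
        grouped.setdefault(category_id(suggestion), []).append(suggestion["Text"])
    return grouped


# The two suggestions for "Book" in "Book is on the shelf".
book = [{"LrnCatId": 13, "Text": "The book"}, {"LrnCatId": 12, "Text": "A book"}]
print(suggestions_by_category(book))
```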
Following is the full list of correction grammatical categories the Ginger API may return. Each category is listed with its id and name.
The following section describes the types of mistakes captured by each category, with examples.
Spelling
Fizix is a great sudgekt
The marble statue had a big hed
This problem is to complicated
I wasn't sure what to except
I want to rid a camel
We decided to remove the item form our store
The bed room is comfortable
This type of behavior can mad den me
This looks makebelieve, not real
I amnot going there
The colour purple is my favorite [colour is corrected to color if lang=US]
I live near the town center [center is corrected to centre if lang=UK]
My friend john is not well today
We will always have paris
My english teacher is stern
My freudian slip turned out to be fatal
I went to to the store
I won't let let her do it
John is studying for a MBA degree
This is an great show
This is great show
I had time of my life on this vacation
These are some of things I have to deal with
Sheryl went to the tickets office
My wife name is Sara
The boys teacher is not coming today
We bought a number of item
Six people lost their life in the accident
I sleep in a small beds
I need you help
Mary and me just had a long conversation
This movie is bad than anything I have seen
She is the most pretty girl in her class
This circle is more round than it seems
Jane couldn’t located your phone number
It is important to submitted the paper on time
Deploying this in the cloud will allowing scaling
The battery is lasting for only 2 hours
I have develop this idea for days
Has we met before?
Terry has writing a letter at the moment
I am go to the market
I go to the store yesterday
Yesterday I download an app for long distance calls
My parents has never been to Florida before this summer
Tom have already left work before he realized he forgot his notes
He was sell fruit on the side of the road
I am having a problem with my connection yesterday
Amy try going to the chess club tomorrow
I'll will review the test later
A bouquet of yellow roses lend color and fragrance to the room
My aunt or my uncle are arriving by train today
I refuse give up
She expects it is ready by five
The tutor allowed us talking as long as it was in English
We are looking for employees with proved experience
My dream has giving me strength to lead others
It's take Sam two hours to get home
She is a beautifully woman
He closed the door quiet
He sings good
We arrived to the station
The differences among English, Chinese, and Arabic are significant
Do you have a good picture I can incorporate in the presentation?
You can get there with bus or on foot
I arrived in five o'clock
She lives in 666 Elm Street
The relevant data appeared on the rightmost column
this is not right
what a nice house!
Oh well let's go for it
Students will demonstrate understanding of spoken words syllables and sounds (phonemes)
98 West Pulaski Road Huntington Station NY
Is this really true
How did you manage to do it
Can you make me a favor?
I want to stay till the finish of the party
They said us to come quickly
A lotta things depend on his success
I coulda made it
He lives in a big house
He is a nice guy
The job was done by him
The deed is being done as we speak
We raise 3 cats and a newborn baby
The bad weather delayed me by 5 days
The trip was boring, but the road was beautiful [“tiring” will be suggested as a synonym for “boring” and “route” as a synonym for “road”]
How I would love a good night's sleep! [“slumber” will be suggested as a synonym for “sleep”]