J_Inflect



Development Release 0.3.6 - July 27, 2000
Japanese Verb and Adjective Inflection/Analysis Engine

I wrote this program both as a learning exercise and as a study tool for inflected forms (verbs and adjectives) in Japanese. It is a work in progress and is currently a command-line tool. I'm making it available to solicit comments and corrections, so please try it out and let me know what you think. As development continues, I will keep the latest working version here. I plan to get back to development work on it soon, but I can't make any promises.

What it does:

  • Forward conjugate Japanese verbs and adjectives into many different inflectional forms. Results can be output with inflected kanji and kana to an EUC text file. For words listed in the dictionary files, fully realized English translations are provided.
  • Reverse conjugate an inflectional form, matching it against possible candidates from a dictionary file.
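
As an illustration only (this is not J_Inflect's actual code), the two operations above can be sketched in a few lines of Python. The rule list, the mini-dictionary, and the function names are all hypothetical, the rules cover only a few forms of ru-dropping (ichidan) verbs in roomaji, and the generate-and-match approach to reverse conjugation is an assumption about one simple way to do it:

```python
# Hypothetical toy rules: (form name, dictionary-form ending, inflected ending).
# They only cover ichidan verbs written in roomaji.
ICHIDAN_RULES = [
    ("polite present", "ru", "masu"),
    ("plain past", "ru", "ta"),
    ("te-form", "ru", "te"),
]

# Hypothetical mini-dictionary; the real dictionary files are far larger.
DICTIONARY = {"taberu": "to eat", "miru": "to see", "neru": "to sleep"}


def conjugate(word):
    """Forward conjugation: apply every rule whose ending matches the word."""
    forms = {}
    for name, ending, replacement in ICHIDAN_RULES:
        if word.endswith(ending):
            forms[name] = word[: -len(ending)] + replacement
    return forms


def reverse_conjugate(inflected):
    """Reverse conjugation: conjugate each dictionary word and keep the matches."""
    matches = []
    for word in DICTIONARY:
        for name, form in conjugate(word).items():
            if form == inflected:
                matches.append((word, name))
    return matches
```

With these toy tables, conjugate("taberu") includes "tabemasu" as the polite present, and reverse_conjugate("mita") finds "miru" as a plain past candidate.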

Downloads
Current as of July 27, 2000

Windows (95/98/NT) v.0.3.6: J_Inflect.zip (417 KB)
If you have a problem downloading this, please e-mail me at cmmcculley AT charter DOT net, and I can send you the file through e-mail.
Note [4/28/01]: I have packaged in a new rules file that contains a number of new inflections and is better organized. The original is still there, and you can rename the files to use it instead if you want to. I did not remake the Mac distribution file, but the new verb rules file from this archive will also work on the Mac. (If you don't have something that will read a ZIP file, e-mail me and I will send the rules file to you.)

Macintosh PPC v.0.3.6 pre-release: J_Inflect.sit (360 KB)

Documentation for v.0.3.6
(Note: both download files include the documentation.)

Version 0.3.6 for Windows has been released as a DOS console application.

The PC/Win version of this program previously used a console emulator that unfortunately had some bugs, so J_Inflect has been recast as a console (DOS window) application in order to make this release available. On Win98 there isn't much you can do about information scrolling off the screen, except perhaps limiting queries to one tense at a time or sending output to a file. On WinNT (or 2000, presumably), I suggest setting both width values to 85 under the "Layout" tab in the Console control panel. Set the window height value to make the window as tall as possible while still fitting on your screen. Set as large a buffer height as is comfortable (500 lines or more) to enable the console scroll bar, so that you can see results that scroll off the screen. Once that is done, you should be able to double-click the application (.exe) file as usual to start a console window with J_Inflect.

The new command "exit" will stop the program.

Version 0.3.6 contains fully functional reverse conjugation, provided by the inclusion of full-sized verb and adjective dictionary files extracted from Jim Breen's EDICT file. Here is the full list of changes:

  • Changed the format of the Words_Verbs.dat and Words_Adj.dat files so that they no longer contain a redundant roomaji entry. The program now converts internally to roomaji from kana. In addition, dynamic English translation entries are optional. If an entry doesn't have them, the standard [to do] form translation is used instead.
  • Distilled new Words_Verbs.dat and Words_Adj.dat files from EDICT. There are now 8777 available verbs (including about 3000 verbs using suru as an auxiliary), and 799 entries in the adjectives file. Because EDICT does not mark true adjectives, the adjectives were picked out by hand from a list of words ending in the syllable -i. It is possible (probable, in fact) that some of the words discarded were actually adjectives with un-adjective-like definitions, or that some of the words kept are not actually adjectives. Please report any omissions or non-adjectives that you find so that corrections can be made to the next version of the file.
  • Added the command 'dtest' to do a full conjugation of the contents of either of the dictionary data files. This has been used to find and fix problems conjugating certain forms, test the coherence of the data files, and discover problems with the internal kana/roomaji table.
  • Fixed problems with the kana/roomaji table, including missing syllables, support for katakana, and support for combinations and punctuation peculiar to katakana.
  • Added a trap in the roomaji-to-kana translator to prevent invalid roomaji strings from being translated. This prevents a misapplied rule from crashing the program; instead, the kana reading shows a message that the rule application is invalid. (It should be obvious to someone reading the roomaji, e.g. "sosnasaimase", that the rule produced a nonsensical result.)
  • Added a help command to give the user on-line reminders of available commands, command syntax and available switches.
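
The trap in the roomaji-to-kana translator can be pictured with a small Python sketch. This is an assumption about the general idea, not the program's real code: the syllable table is a tiny hypothetical subset, the function name is made up, and the wording of the message is invented for the example. The translator matches the longest known syllable at each position and gives up with a message, rather than failing, as soon as it meets something it cannot match:

```python
# Hypothetical, deliberately tiny syllable table (hiragana only).
ROMAJI_TO_KANA = {
    "a": "あ", "i": "い", "u": "う", "e": "え", "o": "お",
    "sa": "さ", "shi": "し", "su": "す", "se": "せ", "so": "そ",
    "ta": "た", "te": "て", "na": "な", "be": "べ",
    "ma": "ま", "ru": "る", "n": "ん",
}

# Assumed wording; the message J_Inflect actually prints is not shown here.
INVALID = "[invalid rule application]"


def romaji_to_kana(romaji):
    """Translate roomaji to kana, or return INVALID if any part cannot be matched."""
    out = []
    pos = 0
    while pos < len(romaji):
        # Try the longest candidate syllable first (three letters, then two, then one).
        for size in (3, 2, 1):
            chunk = romaji[pos:pos + size]
            if chunk in ROMAJI_TO_KANA:
                out.append(ROMAJI_TO_KANA[chunk])
                pos += len(chunk)
                break
        else:
            # No syllable matches at this position: trap the bad string.
            return INVALID
    return "".join(out)
```

Here romaji_to_kana("tabemasu") yields たべます, while the nonsensical "sosnasaimase" stops at the stray "s" after "so" and returns the message instead.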

Bugs Fixed/Features Added in Currently Available Version

0.3.5 released June 11, 2000:

  • Removed the different infinitive/inflection kanji entries from the dictionary, as it turns out these are unnecessary.
  • Added dynamic English translation. For verbs in the Words_Verbs.dat file, translations are now given based directly on the verb being conjugated. For instance, conjugating "taberu" gives "I ate" for the past indicative. To support this, the data file now contains entries for the infinitive (without "to"), 3rd person present, present participle, plain past, and past participle of the primary English translation of the verb. These are directly substituted for [do], [does], [doing], [did], and [done] in the translation template, respectively. For adjectives, an additional field in the dictionary specifies the primary English translation to be substituted for [X] in the rules file.
  • Restructured the verb rules file to remove redundancy and make the file easier to read and edit.
  • Added the command rconj, which performs reverse conjugation. Given an inflection as an argument, it finds all words in the dictionary files that can conjugate to that inflection and presents them to the user. It searches verbs by default, but accepts the switch -a to search the adjective dictionary or the switch -b to search both. If a single result is found, it becomes the current word. If multiple results are found, the user can choose one using the command cw with the switch -r.
    NOTE: The dictionary files are only test versions that are woefully incomplete, so although this functionality works, it requires more available data to be really useful.
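
The placeholder substitution behind dynamic English translation can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function names and the example templates ("I [did]", "was not [X]") are hypothetical, and only the placeholder names and the order of the five English forms come from the description above.

```python
# Placeholders in the order the five English forms are listed above:
# infinitive, 3rd person present, present participle, plain past, past participle.
PLACEHOLDERS = ("[do]", "[does]", "[doing]", "[did]", "[done]")


def fill_verb_template(template, forms):
    """Substitute the five English verb forms into a translation template."""
    if len(forms) != len(PLACEHOLDERS):
        raise ValueError("expected exactly five English verb forms")
    for placeholder, form in zip(PLACEHOLDERS, forms):
        template = template.replace(placeholder, form)
    return template


def fill_adjective_template(template, meaning):
    """Substitute the primary English translation of an adjective for [X]."""
    return template.replace("[X]", meaning)


# Hypothetical entry for "taberu" (to eat).
EAT = ("eat", "eats", "eating", "ate", "eaten")
```

With the hypothetical template "I [did]", fill_verb_template("I [did]", EAT) gives "I ate", which matches the past indicative example for "taberu" above.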

Collin McCulley
cmmcculley AT charter DOT net
Last updated December 16, 2001.