Selected Unicode Mathematical Alphanumeric Characters
Version of Wednesday 8 May 2024.
Dave Barber's other pages.

§1 Introduction. The Unicode standard includes hundreds of letters and digits in special fonts intended to be useful to mathematicians, residing largely in the character block of code point range 1D400-1D7FF. This report organizes a large subset of them, as well as related characters from elsewhere within Unicode.

Unicode Technical Report #25 provides guidance and rationales for use of mathematical characters.

The present author has a report that covers Unicode's rendering of the Arabic digits in various scripts.

§2 Listings of characters. Below are links to pages with detailed information, mainly in table format, about Unicode's mathematical alphanumeric characters.

§2A. Every Unicode character is represented by a number (or occasionally a short sequence of numbers) known as its code point. Code points are the definitive way to identify a character. Glyphs do not suffice for this purpose, because two different fonts might provide substantially different glyphs for the same character; or because two different characters might have glyphs that look similar or identical.
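As a small illustration of why code points, rather than glyphs, identify a character, the following Python sketch inspects three capital letters that many fonts draw identically. The helper name `describe` and the choice of sample letters are illustrative assumptions, not part of this report's tables.

```python
import unicodedata

# Three capital letters whose glyphs are often indistinguishable:
# Latin A, Greek Alpha, Cyrillic A.
lookalikes = ["A", "\u0391", "\u0410"]

def describe(ch: str) -> str:
    """Return the code point in U+ notation followed by the Unicode name."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

for ch in lookalikes:
    print(ch, describe(ch))
```

Although the three glyphs may look the same on screen, each line printed shows a different code point and a different name.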

§2B. For each character in these listings, the code point is given in the context of a numeric character reference (NCR) as used in HTML. Although decimal numbers can be used in an NCR, this report always uses hexadecimal numbers, as is the practice in official Unicode documentation.

Further, the tables generally include character entity references (CERs) when they exist. For example, either the NCR &#x212C; or the CER &Bscr; yields the glyph ℬ (table L-5). Either kind of reference can be copied and pasted directly into HTML source code. In non-HTML contexts, this character would often be written in the notation U+212C.
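The equivalence of the two kinds of reference can be checked outside a browser. The sketch below uses Python's standard `html.unescape`, which follows the HTML5 entity list; the decimal NCR is included only to show that it resolves to the same character as the hexadecimal one.

```python
import html

# Hexadecimal NCR, decimal NCR (0x212C == 8492), and the named entity.
from_hex_ncr = html.unescape("&#x212C;")
from_dec_ncr = html.unescape("&#8492;")
from_cer = html.unescape("&Bscr;")

print(from_hex_ncr, from_dec_ncr, from_cer)
print(f"U+{ord(from_cer):04X}")
```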

§2C. In the early days of Unicode, it was anticipated that 16 bits, equivalent to four hexadecimal digits, would be enough to represent all desired characters. From this arose the custom to write code points with four digits, adding leading zeroes as necessary. For example, the character whose code point is 61 ("a") would be denoted as U+0061. This report, however, omits leading zeroes in the NCRs.
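The two writing conventions differ only in padding. A minimal sketch, with the hypothetical helper names `u_plus` and `ncr`:

```python
def u_plus(ch: str) -> str:
    """U+ notation: at least four hexadecimal digits, zero-padded."""
    return f"U+{ord(ch):04X}"

def ncr(ch: str) -> str:
    """Hexadecimal numeric character reference without leading zeroes."""
    return f"&#x{ord(ch):X};"

for ch in ["a", "\u212C", "\U0001D76F"]:
    print(u_plus(ch), ncr(ch))
```

Code points of five digits are unaffected by the padding, so U+1D76F appears the same way in both conventions apart from the surrounding syntax.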

Unicode has grown, and 21 bits are now necessary to encode all the currently-defined characters. This means that many of the newest characters require 5 hexadecimal digits for writing their code points, such as U+1D76F (𝝯). Unicode has defined over 149,000 characters, and few available fonts attempt to render them all; moreover, many fonts render characters inconsistently. The current Unicode architecture caps code points at U+10FFFF, a space of about 1.1 million values that can accommodate considerable further growth.
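The bit and digit counts can be verified with a few lines of Python. This is only a sketch; it relies on `sys.maxunicode`, which in current Python versions equals the highest code point in the Unicode code space.

```python
import sys

# Highest code point Python accepts, equal to the top of the Unicode code space.
MAX_CODE_POINT = sys.maxunicode

bits_needed = MAX_CODE_POINT.bit_length()

sample = 0x1D76F                      # the sans-serif bold nabla cited above
hex_digits = len(f"{sample:X}")
fits_in_16_bits = sample <= 0xFFFF

print(f"U+{MAX_CODE_POINT:X} needs {bits_needed} bits")
print(f"U+{sample:X} has {hex_digits} hexadecimal digits; "
      f"fits in 16 bits: {fits_in_16_bits}")
```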

§2D. For each entry in a table, shown are:

• the character as it appears in the font of the user's browser;
• the NCR;
• one or more CERs if they exist.

Throughout the tables, the characters are generally displayed in columns which are tinted various colors for ease of reading, mostly in alphabetical or numerical order. In some tables, the plainest version of each character, which is not always suitable for precise mathematical typography, is included in a gray column at the left for comparison.

Yellow cells emphasize characters whose code point is out of numerical sequence; irregularities arise in part because Unicode is an evolving standard. Many categories of characters, once created in part, are subsequently expanded using whatever code numbers are still available at the later time.
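One well-known irregularity of this kind is the script capital B. The sketch below computes where the letter would sit if the mathematical script capitals were contiguous, and checks whether that slot is assigned. The helper name `in_sequence_slot` is hypothetical, and the result depends on the Unicode data bundled with the Python version in use.

```python
import unicodedata

SCRIPT_CAPITAL_A = 0x1D49C   # MATHEMATICAL SCRIPT CAPITAL A

def in_sequence_slot(letter: str) -> int:
    """Code point the letter would have if the script capitals were contiguous."""
    return SCRIPT_CAPITAL_A + (ord(letter) - ord("A"))

slot_b = in_sequence_slot("B")
# unicodedata.name returns the default (None) for an unassigned code point.
slot_b_assigned = unicodedata.name(chr(slot_b), None) is not None

actual_b = "\u212C"          # the character actually used, from Letterlike Symbols

print(f"expected slot U+{slot_b:X}, assigned: {slot_b_assigned}")
print(f"actual character U+{ord(actual_b):04X} {unicodedata.name(actual_b)}")
```

The slot in the main sequence is left empty because the letter had already been encoded elsewhere before the mathematical block was created.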

In the tables, superscripts and subscripts are shown inside a pair of reverse brackets for comparison. Some other classes of characters receive special contexts explained at the point of use.

Latin letters

table L-1: Latin letters, sans-serif
sample:

plain text    non-italic non-bold    non-italic bold    italic non-bold    italic bold
A  &#x41;     𝖠  &#x1D5A0;           𝗔  &#x1D5D4;       𝘈  &#x1D608;       𝘼  &#x1D63C;
a  &#x61;     𝖺  &#x1D5BA;           𝗮  &#x1D5EE;       𝘢  &#x1D622;       𝙖  &#x1D656;
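The sans-serif sample suggests how the remaining letters can be reached: each style occupies a run of 26 capitals followed by 26 small letters. The following sketch converts plain text by adding an offset to the first letter of each run. The function name `to_sans_serif` and the style labels are illustrative; the approach assumes the runs are contiguous, which holds for the sans-serif styles but not for every style in the block.

```python
# First code points of the capital and small runs for each sans-serif style,
# taken from the sample cells for A and a.
SANS_SERIF_BASES = {
    "non-italic non-bold": (0x1D5A0, 0x1D5BA),
    "non-italic bold": (0x1D5D4, 0x1D5EE),
    "italic non-bold": (0x1D608, 0x1D622),
    "italic bold": (0x1D63C, 0x1D656),
}

def to_sans_serif(text: str, style: str) -> str:
    """Replace ASCII letters with their sans-serif mathematical counterparts."""
    upper_base, lower_base = SANS_SERIF_BASES[style]
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            out.append(chr(upper_base + ord(ch) - ord("A")))
        elif "a" <= ch <= "z":
            out.append(chr(lower_base + ord(ch) - ord("a")))
        else:
            out.append(ch)   # leave digits, spaces, punctuation untouched
    return "".join(out)

print(to_sans_serif("Unicode math", "non-italic bold"))
```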
table L-2: Latin letters, avec-serif
sample:

plain text    italic non-bold    non-italic bold    italic bold
A  &#x41;     𝐴  &#x1D434;       𝐀  &#x1D400;       𝑨  &#x1D468;
a  &#x61;     𝑎  &#x1D44E;       𝐚  &#x1D41A;       𝒂  &#x1D482;
table L-3: Latin letters, monospaced
sample:

ordinary         fullwidth
𝙰  &#x1D670;     Ａ  &#xFF21;
𝚊  &#x1D68A;     ａ  &#xFF41;
table L-4: Latin letters, enclosed
sample:

 ⒶⒶ ⓐⓐ 🅐🅐 🄰🄰 🅰🅰 🄐🄐 ⒜⒜
table L-5: Latin letters, miscellaneous
sample:

double-struck    script non-bold    script bold    fraktur non-bold    fraktur bold    outline (Unicode 16.0)
𝔸𝔸𝔸 𝕒𝕒𝕒          𝒜𝒜𝒜 𝒶𝒶𝒶            𝓐𝓐 𝓪𝓪          𝔄𝔄𝔄 𝔞𝔞𝔞             𝕬𝕬 𝖆𝖆           𜳖𜳖
Greek letters

table G-1: Greek letters, main
sample:

plain text            avec-serif         avec-serif         avec-serif      sans-serif         sans-serif
                      non-italic bold    italic non-bold    italic bold     non-italic bold    italic bold
Α  &#x391; &Alpha;    𝚨  &#x1D6A8;       𝛢  &#x1D6E2;       𝜜  &#x1D71C;    𝝖  &#x1D756;       𝞐  &#x1D790;
α  &#x3B1; &alpha;    𝛂  &#x1D6C2;       𝛼  &#x1D6FC;       𝜶  &#x1D736;    𝝰  &#x1D770;       𝞪  &#x1D7AA;
table G-2: Greek letters, miscellaneous
sample:

plain text             avec-serif         avec-serif         avec-serif      sans-serif         sans-serif
                       non-italic bold    italic non-bold    italic bold     non-italic bold    italic bold
∇  &#x2207; &nabla;    𝛁  &#x1D6C1;       𝛻  &#x1D6FB;       𝜵  &#x1D735;    𝝯  &#x1D76F;       𝞩  &#x1D7A9;
∂  &#x2202; &part;     𝛛  &#x1D6DB;       𝜕  &#x1D715;       𝝏  &#x1D74F;    𝞉  &#x1D789;       𝟃  &#x1D7C3;
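The relationship between the styled Greek characters and their plain-text counterparts is recorded in the Unicode data itself as a compatibility mapping. A brief Python sketch, using the hypothetical helper name `to_plain`, recovers the plain character through NFKC normalization:

```python
import unicodedata

def to_plain(text: str) -> str:
    """Fold styled mathematical characters back to their plain forms (NFKC)."""
    return unicodedata.normalize("NFKC", text)

styled = ["\U0001D6C2", "\U0001D6C1", "\U0001D76F", "\U0001D715"]
for ch in styled:
    plain = to_plain(ch)
    print(f"U+{ord(ch):X} {ch} -> U+{ord(plain):04X} {plain}")
```

The folding is lossy: bold, italic, and sans-serif variants of a letter all collapse to the same plain character, so it suits searching and comparison rather than typesetting.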
Numerals

table N-1: Numerals, general
sample:

style                       sample
plain text                  00
sans-serif non-bold         𝟢𝟢
sans-serif bold             𝟬𝟬
avec-serif bold             𝟎𝟎
ordinary monospace          𝟶𝟶
fullwidth monospace         ００
double-struck               𝟘𝟘
superscript                 ]⁰[⁰
subscript                   ]₀[₀
outline (Unicode 16.0)      𜳰𜳰
segmented (Unicode 16.0)    🯰🯰
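The digit styles are not all classified alike in the Unicode data. In the following sketch, which assumes only Python's standard `unicodedata` module, the mathematical and fullwidth zeros count as decimal digits that `int` accepts, whereas the superscript and subscript zeros carry a digit value without being decimal digits.

```python
import unicodedata

samples = {
    "sans-serif bold": "\U0001D7EC",
    "double-struck": "\U0001D7D8",
    "fullwidth": "\uFF10",
    "superscript": "\u2070",
    "subscript": "\u2080",
}

for style, ch in samples.items():
    value = unicodedata.digit(ch)          # digit value recorded by Unicode
    print(f"{style:16} {ch} value={value} decimal={ch.isdecimal()}")

# Decimal digits from different styles can be mixed in one call to int().
mixed = int("\U0001D7EC\uFF10")
```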
table N-2: Enclosed numerals
sample:

 ➀➀ ⓵⓵ ❶❶ ➊➊ ⓫⓫ ⑴⑴ ⑾⑾
table N-3: Roman numerals
sample:

value    plain text             Roman
1        I  &#x49;  i  &#x69;     Ⅰ  &#x2160;  ⅰ  &#x2170;

value    chars
500      ⅠↃ
1,000    ⅭⅠↃ
table N-4: Greek numerals
sample:

1      ΑΑΑ ααα
10     ΙΙΙ ιιι
100    ΡΡΡ ρρρ
table N-5: Cyrillic numerals
sample:

1      ААА
10     ІІІ ЇЇЇ
100    РРР
table N-6: Fractions
sample:

 ½½½½ ⅓⅓⅓ ¼¼¼ ⅕⅕⅕ ⅙⅙⅙ ⅐⅐ ⅛⅛⅛ ⅑⅑ ⅒⅒ ⅟⅟
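Each of these fraction characters has a numeric value recorded in the Unicode data. The sketch below reads it with `unicodedata.numeric`, which returns a floating-point number, and converts it to an exact fraction. The helper name `fraction_value` and the denominator limit of 100 are assumptions made for this illustration.

```python
import unicodedata
from fractions import Fraction

def fraction_value(ch: str) -> Fraction:
    """Exact value of a vulgar-fraction character, recovered from its float value."""
    return Fraction(unicodedata.numeric(ch)).limit_denominator(100)

for ch in ["\u00BD", "\u2153", "\u00BC", "\u2150", "\u2152"]:
    print(ch, fraction_value(ch))
```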
Combining characters

table series D: Diacritics
sample:

 Ẏẏ Ẏẏ Ỵỵ Ỵỵ Ÿÿ Ÿÿ Y̤y̤ Y̤y̤
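Some letter-plus-diacritic combinations exist as single precomposed characters, while others can only be written as a base letter followed by a combining character. A short Python sketch using the standard normalization forms shows both cases; the choice of Y with dot above and Y with diaeresis below follows the sample.

```python
import unicodedata

# Y with dot above exists as one precomposed character, U+1E8E.
precomposed = "\u1E8E"
decomposed = unicodedata.normalize("NFD", precomposed)     # Y + U+0307
recomposed = unicodedata.normalize("NFC", decomposed)

# Y with diaeresis below has no precomposed form, so NFC leaves two code points.
y_diaeresis_below = unicodedata.normalize("NFC", "Y\u0324")

print([f"U+{ord(c):04X}" for c in decomposed])
print([f"U+{ord(c):04X}" for c in y_diaeresis_below])
```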

§3 Other scripts. Latin and Greek letters receive extensive support for the font variations required by mathematicians: sans-serif, avec-serif, italic, bold, et cetera. By contrast, Cyrillic letters, which resemble Greek letters, receive little accommodation for the needs of mathematicians. Cyrillic numerals, which resemble Greek numerals, get only basic Unicode coverage.

Benetia et alii discuss Arabic mathematical symbols in Unicode.

Two systems of Braille notation for mathematics are Nemeth and Gardner–Salinas. Unicode Braille defines 256 dot patterns, but does not specify what they might mean.
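The count of 256 follows from eight dots, each either raised or not. In the Braille Patterns block, dot n corresponds to bit n-1 of the offset from U+2800, so any pattern can be computed directly. A minimal sketch, with the hypothetical helper name `braille`:

```python
BRAILLE_BASE = 0x2800   # BRAILLE PATTERN BLANK

def braille(dots) -> str:
    """Return the Braille pattern character with the given dots (1 to 8) raised."""
    bits = 0
    for dot in dots:
        if not 1 <= dot <= 8:
            raise ValueError(f"dot number out of range: {dot}")
        bits |= 1 << (dot - 1)
    return chr(BRAILLE_BASE + bits)

pattern_count = 2 ** 8
print(pattern_count, braille([1]), braille([1, 2]), braille(range(1, 9)))
```

The function assigns no meaning to a pattern; that is left to a notation such as Nemeth, as noted above.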

Unicode has special mathematical characters for the first four letters of the Hebrew alphabet, as below. No rationale is evident for omitting the rest of the alphabet. Hebrew is read from right to left.

Table H-1: Four Unicode Hebrew letters

plain text    דד גג בב אא
math use      ℸℸℸ ℷℷℷ ℶℶℶ ℵℵℵ

§4 What Unicode does not do. As Unicode is a character set, and not a markup language, it does not attempt to provide comprehensive support for the typography of superscripts, subscripts, and fractions. To provide an example of what might be done in a markup language, however, here are some ways of effecting these in HTML:

HTML source code                             result
base<sup>superscript</sup>                   "base" followed by a raised "superscript"
base<sub>subscript</sub>                     "base" followed by a lowered "subscript"
<sup>numer</sup>&frasl;<sub>denom</sub>      raised "numer", fraction slash, lowered "denom"

HTML allows superscripts and subscripts (hence numerators and denominators) to be nested, although the results may be difficult to read correctly. In fact, HTML allows almost anything to be nested when it makes logical sense.

Unfortunately, there is no guarantee that the three following ways of rendering the number one-half will yield identical glyphs:

HTML source code                     result
<sup>1</sup>&frasl;<sub>2</sub>      raised 1, fraction slash, lowered 2
&frac12;                             ½
&#x215F;<sub>2</sub>                 ⅟ followed by a lowered 2
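The three renderings also differ at the character level, which is one reason their glyphs need not match. The Python sketch below resolves the references with the standard `html` module and expands the single character ½ through its compatibility mapping. It examines only the characters, not how a browser draws them.

```python
import html
import unicodedata

fraction_slash = html.unescape("&frasl;")      # U+2044 FRACTION SLASH
one_half = html.unescape("&frac12;")           # U+00BD, a single character
numerator_one = html.unescape("&#x215F;")      # U+215F FRACTION NUMERATOR ONE

# Compatibility normalization expands the single character into three.
expanded = unicodedata.normalize("NFKC", one_half)

print([f"U+{ord(c):04X}" for c in expanded])
print(unicodedata.name(numerator_one))
```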

§5 Documentation of the Unicode standard.

Unicode blocks pertinent to this report

code points    subject
0000-007F      Basic Latin
0080-00FF      Latin-1 Supplement
0100-017F      Latin Extended-A
0180-024F      Latin Extended-B
0250-02AF      IPA Extensions
0300-036F      Combining Diacritical Marks
0370-03FF      Greek and Coptic
0400-04FF      Cyrillic
0590-05FF      Hebrew
1AB0-1AFF      Combining Diacritical Marks Extended
1D00-1D7F      Phonetic Extensions
1DC0-1DFF      Combining Diacritical Marks Supplement
2000-206F      General Punctuation
2070-209F      Superscripts and Subscripts
20D0-20FF      Combining Diacritical Marks for Symbols
2100-214F      Letterlike Symbols
2150-218F      Number Forms
2200-22FF      Mathematical Operators
2300-23FF      Miscellaneous Technical
2460-24FF      Enclosed Alphanumerics
25A0-25FF      Geometric Shapes
2700-27BF      Dingbats
2800-28FF      Braille Patterns
2C00-2C5F      Glagolitic
3200-32FF      Enclosed CJK Letters and Months
A640-A69F      Cyrillic Extended-B
A720-A7FF      Latin Extended-D
FB00-FB4F      Alphabetic Presentation Forms
FE20-FE2F      Combining Half Marks
FF00-FFEF      Halfwidth and Fullwidth Forms
10140-1018F    Ancient Greek Numbers
1D400-1D7FF    Mathematical Alphanumeric Symbols
1EE00-1EEFF    Arabic Mathematical Alphabetic Symbols
1F100-1F1FF    Enclosed Alphanumeric Supplement
1FB00-1FBFF    Symbols for Legacy Computing

There are hundreds of non-pertinent blocks.
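A block listing of this kind lends itself to a simple range lookup. The sketch below copies a handful of rows from the listing into a Python list and reports the block containing a given character. The names `BLOCKS` and `block_of` are hypothetical, and the list is deliberately incomplete.

```python
# (first code point, last code point, block name), a few rows from the listing.
BLOCKS = [
    (0x0000, 0x007F, "Basic Latin"),
    (0x0370, 0x03FF, "Greek and Coptic"),
    (0x2100, 0x214F, "Letterlike Symbols"),
    (0x2150, 0x218F, "Number Forms"),
    (0x1D400, 0x1D7FF, "Mathematical Alphanumeric Symbols"),
]

def block_of(ch: str):
    """Return the name of the listed block containing ch, or None if not listed."""
    cp = ord(ch)
    for first, last, name in BLOCKS:
        if first <= cp <= last:
            return name
    return None

for ch in ["a", "\u212C", "\U0001D76F"]:
    print(f"U+{ord(ch):04X}", block_of(ch))
```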

Although the above documents do a thorough job of defining the standard, they are not always convenient for people seeking particular characters. Many web sites, including the present one, have sprung up to make such character searches easier; notable among them is Compart. Also, Wikibooks has a handy listing of mathematical characters.

§6 Miscellaneous.

Opinion.

Unicode is a very good thing. Of course it is not perfect, in part because a huge number of people have been involved in its development, and they have often had diverging views. Also, new insights occasionally emerge that tend to alter its direction of progress. Still, Unicode fully deserves the near-ubiquity it has achieved throughout the world's computers. In particular, Unicode's UTF-8 encoding standard has been a great success.

Colophon.

In the original work of year 2022, the pages containing the tables in the L-, G-, and N- series were generated by a custom-written C++ program, which also generated the samples that appear on this page. This was done in a Mac Xcode environment. Other parts of this page were created directly with a text editor, and everything was then combined with a Unix script. This indirect approach was necessary to manage the many characters to be treated and to devise a consistent format for organizing them, which in turn required seemingly endless tinkering.

In the extensive revisions of year 2024, design of the project had stabilized, simplifying further development. At this point the C++ program was discarded, with further changes being made directly in the HTML files using the Xcode text editor, which is HTML-aware.