This repository has been archived by the owner on Dec 29, 2022. It is now read-only.

disunification and naming suggestions from Karl Pentzlin #64

Closed
GoogleCodeExporter opened this issue Jun 2, 2015 · 3 comments

Comments

@GoogleCodeExporter

Quick comments on some Emoji symbols
 - Karl Pentzlin 2009-01-06

Reference:
http://www.unicode.org/~scherer/emoji4unicode/snapshot/full.html
as of 2009-01-06

The basis for some of the comments is:
- Symbols which are not merely glyph variants of each other should not
 be unified; if someone can address different semantics to two
 symbols they are different symbols, even when they are used
 interchangeably in the Japanese Telco context. When encoded in
 Unicode, the context is no longer limited in this way.
- Symbols should be named as they appear as emoji, not according
 to the black-and-white fallback glyph which is associated to
 them to print the Unicode charts. This means:
 · Symbols with an inherent color shall bear this color in their
   name unless the entity denoted by the name identifies the color
   anyway (e.g., a BANANA is uniquely yellow and therefore does
   not need to be called YELLOW BANANA, while a RED APPLE must be
   named so as there are also green apples).
 · Symbols whose semantics include animation shall have ANIMATED
   as part of their names (this does not apply to symbols where
   animation is a feature of glyph variance only).

All symbol names are relative to any generic prefixes which are applied
to the set of emoji symbols or subsets of it during the ongoing
discussion.

Any comment starting with "KDDI is", "DoCoMo is", or "SoftBank is"
is a request to not unify this with the other symbols of the same row.

e-004 KDDI is THUNDERSTORM WITH RAIN
e-008 SoftBank is NIGHT WITH FALLING STAR
e-014 should be named otherwise e.g. MOON-LIKE CRESCENT,
     as a crescent moon must have its tips strictly opposite on
     the enclosing circle. Naming this CRESCENT MOON is an offence
     to anybody who knows the astronomical mechanisms.
e-02b...e-037: General comments sent by a previous mail.
e-036 The KDDI symbol shows one fish, while PISCES is plural.
     Therefore, to complete the pictorial Zodiac set a picture
     of two fishes is needed, while the KDDI symbol is "fish".
e-038 SoftBank is TSUNAMI (??)
e-03A should be named ERUPTING VOLCANO (in contrast to the Mount
     Fuji symbol which may be required to be named VOLCANO to
     avoid geographical preferences).
e-040 DoCoMo and SoftBank are PINK CHERRY BLOSSOM or JAPANESE CHERRY BLOSSOM
     (Some European cherry trees blossom in white)
     KDDI is PINK BLOSSOMING CHERRY TREE or JAPANESE BLOSSOMING CHERRY
     TREE
e-044 should not be listed under "nature"; the symbol seems unequivocally
     to be the newly licensed driver plate.
     The name JAPANESE NEW LICENSED DRIVER SIGN seems preferable.
e-051 is RED APPLE
e-057 is WATER MELON - most melons sold in Europe are yellow and oval
e-05B is GREEN APPLE
e-190 is COMIC EYES or EYEBALLS
e-193 seems to be RED LIPS rather than generic MOUTH
e-197 is ANIMATED FACE MESSAGE
e-198 SoftBank is HAIRCUT (??)
e-19F is MAN WOMAN PAIR
e-1A1 is POLICEMANS HEAD WITH FLAT CAP
     (in other countries, police caps may look definitively different)
     if there is a police cap by SoftBank, this is a different FLAT POLICE CAP
e-1A2 KDDI is WOMANS HEAD WITH BUNNY EARS
     SoftBank is TWO DANCING WOMEN WITH BUNNY EARS
e-1A3 is BRIDES HEAD WITH VEIL
e-1A4 at first glance: KDDI is BLOND WOMAN, SoftBank is BLOND MAN
     It seems appropriate to recategorize:
     e-19D            DARK-HAIRED MANS HEAD
     e-19E            DARK-HAIRED WOMANS HEAD
     e-1A4 (KDDI)     BRIGHT-HAIRED WOMANS HEAD
     e-1A4 (SoftBank) BRIGHT-HAIRED MANS HEAD
     e-19B            DARK-HAIRED BOYS HEAD
     e-19B (variant)  BRIGHT-HAIRED BOYS HEAD
     e-19C (SoftBank) DARK-HAIRED GIRLS HEAD
     e-19C (KDDI)     BRIGHT-HAIRED GIRLS HEAD
e-1A5 is MANS HEAD WITH LONG MOUSTACHE
     For reasons of political correctness, there must be two characters:
     DARK-HAIRED MANS HEAD WITH LONG MOUSTACHE
     BRIGHT-HAIRED MANS HEAD WITH LONG MOUSTACHE
     Otherwise, some traditional Bavarians who customarily wear long blonde
     moustaches may be offended.
e-1A6 is MANS HEAD WITH TURBAN
     *** It is *STRONGLY* objected to show this icon with another skin color
         than the others
         ***
     Alternatively, it has to be scrutinized whether ALL person and head
     symbols have to be differentiated by BRIGHT SKINNED, BROWN SKINNED
     and DARK SKINNED versions in a politically correct way which is acceptable
     to all people in the world.
e-1A7 is OLDER MANS HEAD
e-1A8 is OLDER WOMANS HEAD
e-1A9 is BABYS HEAD
e-1AA is CONSTRUCTION WORKERS HEAD WITH HELMET
e-1AB is YOUNG BRIGHT-HAIRED PRINCESS HEAD or
     BRIGHT-HAIRED GIRLS HEAD WITH CROWN
e-1AC is RED FACED OGRES HEAD
e-1AD is LONG-NOSED GOBLINS HEAD
e-1AF is PUTTO ANGEL (simply ANGEL may be offensive to some religious people)
e-1B0 KDDI is ALIEN SPACESHIP, SoftBank is BIG-EYED ALIEN FACE
e-1B2 is FACE WITH DEVILS HORNS (simply DEVIL may be offensive to some
     religious or superstitious people)
e-1B6 KDDI is ANIMATED MALE DISCO DANCER,
     SoftBank is ANIMATED FEMALE FLAMENCO DANCER
e-1B7 is DOG FACE, SoftBank is PUPPY FACE, similarly
     e-1B8,1BF,1C0,1C1,1C2,1CA,1D1,1D2,1D7,154 add " FACE" like it is done for
     e-1C4 MONKEY FACE
e-1BD see comment for e-036
e-1C8 is SITTING WHITE BIRD
e-1D0 is FOX HEAD
e-353 is ANIMATED BOWING FACE
e-357 is ANIMATED PERSON RAISING ONE HAND, SoftBank is PALM OF HAND
e-358 is ANIMATED PERSON RAISING BOTH HANDS, SoftBank is
     ANIMATED PAIR OF HANDS OPENING AND CLOSING
e-359 is ANIMATED PERSON FROWNING
e-35A is ANIMATED PERSON MAKING POUTING FACE
e-35B SoftBank is PAIR OF RAISED FOLDED HANDS
e-4B0 is SMALL HOUSE
e-4B4 is HOSPITAL DENOTED BY CROSS SYMBOL
e-4B5 is ALPHABETIC BANK SYMBOL
e-4B6 is AUTOMATIC TELLER MACHINE SYMBOL
e-4B7 DoCoMo is LATIN LETTER H ENCLOSED IN A HOUSE SYMBOL
e-4C2 is RED LANTERN DENOTING JAPANESE IZAKAYA RESTAURANT
     ("red lantern" is a symbol for two totally different concepts in European
      culture: a. brothel, b. being the last one in a sports competition)
e-4CA is WORKERS MALLET (as it looks different from the common household
     hammer)
e-4CC is MENS LOW SHOE
e-4D2 is TRIDENT (the listing under Clothing/Wearables is wrong)
e-4D5 is LADIES FORMAL DRESS
e-4D7 is MULE SHOE or similar, to denote it is not the animal called mule
e-4DD should be encoded as an enclosing combining mark MONEY BAG, which can
     be applied to any currency symbol
e-4DE is DOLLAR YEN CURRENCY EXCHANGE
e-4DF is CHART WITH RISING CURVE AND YEN SYMBOL
e-4EF is SINGLE-LENS REFLEX STILL PICTURE CAMERA
e-4F4 is FAECES or PICTORIAL EXPRESSION OF DISDAIN
e-4F7 is CRYSTAL BALL ON RACK
e-4FA is MEAT CLEAVER
e-4FB is TORCH
e-4FD this is a nonstandard symbol for window scrolling and must be named in
     a way that it is not mistaken for any ISO 7000 or similar symbol;
     thus it must get a prefix like JAPANESE TELCO SYMBOL if it gets
     no generic name prefix for the emoji set or a subset
e-4FE is ELECTRIC PLUG WITH CABLE
e-4FF is GREEN CLOSED BOOK LYING WITH BACK TO THE RIGHT
     in this way applicable to books to be read from right to left
e-500 is BLUE CLOSED BOOK LYING WITH BACK TO THE RIGHT
e-501 is ORANGE CLOSED BOOK LYING WITH BACK TO THE RIGHT
e-502 is FRONT OF GREEN BOOK WITH LABEL or FRONT OF GREEN NOTEBOOK WITH LABEL
e-503 is STACK OF BOOKS LYING WITH BACK TO THE LEFT
e-505 KDDI is WOMANS HEAD WITH BATHING CAP, SoftBank is PERSON TAKING A BATH
e-506 is LADIES AND GENTS RESTROOMS SIGN
e-509 is SYRINGE WITH DROP OF BLOOD
e-50B/C/D/E: this depends on how the coloring of those emoji which are unique
     when disregarding color is eventually treated. If the black-and-white
     equivalents are to be encoded, these are:
     KDDI: new      U+1F130 SQUARED LATIN LETTER A
           existing U+1F131 SQUARED LATIN LETTER B
           new      U+1F1xx SQUARED DIGIT 0
           new      U+1F1xx SQUARED AB
     SoftBank:  new U+1F150 WHITE ON BLACK CIRCLED LATIN CAPITAL LETTER A
                new U+1F151 WHITE ON BLACK CIRCLED LATIN CAPITAL LETTER B
                new U+1F1xx WHITE ON BLACK CIRCLED DIGIT 0
                    (probably to be unified with U+24FF NEGATIVE CIRCLED
                     DIGIT ZERO)
                new U+1F1xx WHITE ON BLACK CIRCLED AB
e-513 is SANTA CLAUS FACE
e-515 is ANIMATED NIGHT SKY WITH FIREWORKS
e-517 is ANIMATED PARTY POPPER
e-51D is ANIMATED NIGHT SKY WITH JAPANESE SPARKLER
e-520 is ANIMATED OPENING CONFETTI BALL
----------- Comments for emoji symbols starting from e-522 may follow later.


Original issue reported on code.google.com by markus.icu on 6 Jan 2009 at 8:25

@GoogleCodeExporter

My initial reply on the emoji4unicode list:

On colors: We considered symbol colors for disunification but rarely for
character names. Instead, with UTC guidance, we unified a number of symbols
with existing characters which have black/white/striped... glyphs and names.
For newly proposed symbols, we followed the precedent and chose similar
character names, matching the glyphs in the font that is being worked on.

On disunifications: At a glance, it looks like many of the suggested
disunifications assume more specific and precise meanings and shapes than
are intended by the cell phone carriers. For example,
- If a symbol generally looks like a crescent moon (e-014) and is described
  or named by the carriers to represent one, it makes little sense to give
  it a different meaning based on an imprecise symbol shape. (What we can do
  is design a better glyph.)
- If a carrier clearly intends a certain meaning, and shows that in name,
  shape, context of surrounding symbols and maybe other available
  information, we should follow that meaning and not artificially invent a
  separate symbol and meaning. (e-036 pisces vs. KDDI single fish)
- The carriers' understanding of "glyph variants", as expressed in symbol
  names and cross-mapping tables, is clearly broader than your sense of
  "glyph variants". For interoperability, we usually try to follow the
  carriers' cross-mappings, except when they are way off (as in e-7E0 subway
  vs. e-7E1 metro sign, which has been discussed by the UTC before).


Original comment by markus.icu on 6 Jan 2009 at 8:45

@GoogleCodeExporter

Ken Whistler's reply on the emoji4unicode list:

> Quick comments on some Emoji symbols
>   - Karl Pentzlin 2009-01-06
>
> Reference:
> http://www.unicode.org/~scherer/emoji4unicode/snapshot/full.html
> as of 2009-01-06
>
> The basis for some of the comments is:
> - Symbols which are not merely glyph variants of each other should not
>   be unified; if someone can address different semantics to two
>   symbols they are different symbols, even when they are used
>   interchangeably in the Japanese Telco context. When encoded in
>   Unicode, the context is no longer limited in this way.

I disagree. It is true that encoding a character for a symbol in
Unicode puts it in a context where it might not always be
limited to transcoding for the Japanese wireless sets, so that
due consideration must be given to how this is done. However,
when what we are encoding is a compatibility character for an emoji
which is *already* unified by de facto mappings between the
various carrier sets, it is not helpful -- in fact is disruptive --
to disunify glyph variants simply because the telcos use different
glyphs to display the cross-mapped character in question.

In such cases, as for the zodiac symbols which you wrote a
separate note on (and which Markus responded to), the correct
encoding solution here is to treat the cross-mapped emoji as
a *single* character for encoding, and then to either encode
a new single Unicode character (if no existing Unicode character
is appropriate) or to map to a single Unicode character if
one already exists -- as for the zodiac signs.
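
The single-character treatment described above can be sketched in a few
lines of Python. This is only an illustration under stated assumptions: the
table `UNIFIED`, the tuple `CARRIERS`, and the function `to_unicode` are
hypothetical names, not part of the emoji4unicode data, and it is assumed
that all three carriers cross-map a symbol in row e-036. The only facts
relied on are the row ID e-036 from this thread and the existing character
U+2653 PISCES.

```python
import unicodedata

# Hypothetical unified table: one emoji4unicode row ID maps to exactly one
# Unicode code point, no matter which carrier's glyph is displayed.
UNIFIED = {
    "e-036": 0x2653,  # U+2653 PISCES (an existing zodiac character)
}

# Carrier names as used in this thread.
CARRIERS = ("DoCoMo", "KDDI", "SoftBank")


def to_unicode(carrier: str, emoji_id: str) -> str:
    """Return the single Unicode character for a cross-mapped emoji row."""
    if carrier not in CARRIERS:
        raise ValueError(f"unknown carrier: {carrier}")
    # The carrier does not influence the result: the glyph difference
    # (one fish vs. the astrological sign) is treated as glyph variance.
    return chr(UNIFIED[emoji_id])


# Every carrier round-trips through the same character.
results = {carrier: to_unicode(carrier, "e-036") for carrier in CARRIERS}
print(unicodedata.name(results["KDDI"]))
```

A disunified design would instead need the carrier as part of the lookup
key, so that text sent from one carrier could arrive as a different
character on another; that is the interoperability cost being argued
against here.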

If a separate need occurs in the future to distinguish
animal-pictorial representations of zodiac signs, for example,
from traditional astrological symbolic representations of
zodiac signs, that needs to be done in a separate context
and be separately argued from the current emoji set -- because
separately encoding them on the basis merely of the distinct
glyphs used by the wireless carriers would *not* be a helpful
or useful solution to the emoji cross-mapping to Unicode problem.

> - Symbols should be named as they appear as emoji, not according
>   to the black-and-white fallback glyph which is associated to
>   them to print the Unicode charts. This means:
>   · Symbols with an inherent color shall bear this color in their
>     name unless the entity denoted by the name identifies the color
>     anyway (e.g., a BANANA is uniquely yellow and therefore does
>     not need to be called YELLOW BANANA, while a RED APPLE must be
>     named so as there are also green apples).

I disagree. This principle is simply not helpful. It perpetuates
the notion that colors are *inherently* a part of the character
identity here. And that does not serve the purpose of providing
a cross-mapping set for interoperability with the emoji characters.
It would be far, far better to simply have some abstracted
compatibility characters identified as EMOJI SYMBOL FOR BOOK-1,
EMOJI SYMBOL FOR BOOK-2, EMOJI SYMBOL FOR BOOK-3, etc., rather
than to insist on encoding RED BOOK SYMBOL, BLUE BOOK SYMBOL,
ORANGE BOOK SYMBOL, and then jump off the deep end in insisting
that the associated glyphs actually need to support color
distinctions.

>   · Symbols whose semantics include animation shall have ANIMATED
>     as part of their names (this does not apply to symbols where
>     animation is a feature of glyph variance only).

I disagree. This is the same issue as for the colored glyphs,
only more so. It is simply not helpful to insist that
"ANIMATED" be part of the character name, when that is a
description of the animated glyphs used on phones, rather
than a useful identifying label for the *character* we are
going to encode to represent the symbol in question.

> All symbol names are relative to any generic prefixes which are applied
> to the set of emoji symbols or subsets of it during the ongoing
> discussion.
>
> Any comment starting with "KDDI is", "DoCoMo is", or "SoftBank is"
> is a request to not unify this with the other symbols of the same row.

And I will simply put my comment in as opposing *all* such
disunifications across the board, without objecting to each
individual suggestion one-by-one below. I think this whole
approach is a very deep semiotic trap that completely
misconstrues both the problem and the nature of the solution
required for cross-mapping the emoji sets in Unicode.

> e-1A5 is MANS HEAD WITH LONG MOUSTACHE
>       For reasons of political correctness, there must be two characters:
>       DARK-HAIRED MANS HEAD WITH LONG MOUSTACHE
>       BRIGHT-HAIRED MANS HEAD WITH LONG MOUSTACHE
>       Otherwise, some traditional Bavarians who customarily wear long blonde
>       moustaches may be offended.

This is an example of the kind of dead end that this approach
results in. The problem here is to create a standard mapping
code point in Unicode for the emoji symbol listed at e-1A5.
The problem is *not* to solve some generic issue of how to
represent all races, skin colors, and masculine facial hair
styles politically correctly via character codes.

> e-1B6 KDDI is ANIMATED MALE DISCO DANCER,
>       SoftBank is ANIMATED FEMALE FLAMENCO DANCER

That is another example of a completely unhelpful disunification,
as well as an example of the inappropriate application
of "ANIMATED" to a character name. The symbolic concept
being represented here is of a dancer. The glyphs chosen
on the phones to display that concept are animated and
designed differently. But encoding distinct characters and
making them overly specific to glyph designs is simply not
a useful direction to take for the character encoding for
the purpose intended here.

I could make similar comments one-by-one, but it should be clear
that I object to the complete set of comments in principle,
rather than just here and there on its details.


Original comment by markus.icu on 6 Jan 2009 at 9:23

@GoogleCodeExporter

On reviewing these suggestions more closely, I agree with Ken that the
suggestions are based on an overly pedantic interpretation of the carrier
images, disregarding both the common practice of naming Unicode symbols and
the carriers' cross-mappings.

Original comment by markus.icu on 14 Jan 2009 at 5:39

  • Changed state: WontFix
