Introduction
Ruby is a high-level, general-purpose, interpreted programming language that supports several programming paradigms. In this blog, you will learn how to specify the program encoding in Ruby. You will also learn about encoding in general and the encoding schemes commonly used today. So, without further ado, let's begin.
Encoding refers to mapping characters, especially those used in human languages, to numbers so that they can be transmitted, stored, and processed by digital machines. Computers work only with binary sequences, so human-readable characters must be converted into binary form, and an encoding defines exactly how that conversion is done.
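The mapping described above can be sketched in a few lines of Ruby. This is only an illustration: the helper names code_for and binary_for are made up for this example, and the text "Hi" is an arbitrary sample.

```ruby
# Return the numeric code assigned to a single character.
def code_for(char)
  char.ord
end

# Return that code as an 8-bit binary string.
def binary_for(char)
  format("%08b", char.ord)
end

text = "Hi"
codes = text.chars.map { |c| code_for(c) }
bits = text.chars.map { |c| binary_for(c) }

# Going the other way: turn the numbers back into characters.
decoded = codes.map(&:chr).join

puts codes.inspect   # [72, 105]
puts bits.join(" ")  # 01001000 01101001
puts decoded         # Hi
```

The round trip from text to numbers and back works only because both directions use the same mapping, which is why a program and its reader must agree on the encoding.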
Types of Encoding:
There are various encoding schemes, such as ASCII, UTF-8, UTF-16, Hex, Base64, and SJIS. Having multiple encoding schemes makes it possible to represent the different languages used around the world. Two widely used encodings are discussed below:
ASCII: It is an abbreviation for American Standard Code for Information Interchange. This is a character encoding standard for digital communication. ASCII codes represent text in computers, communications equipment, and other electronic devices.
ASCII TABLE
Letter  ASCII Code  Binary      Letter  ASCII Code  Binary
a       097         01100001    A       065         01000001
b       098         01100010    B       066         01000010
c       099         01100011    C       067         01000011
d       100         01100100    D       068         01000100
e       101         01100101    E       069         01000101
f       102         01100110    F       070         01000110
g       103         01100111    G       071         01000111
h       104         01101000    H       072         01001000
i       105         01101001    I       073         01001001
j       106         01101010    J       074         01001010
k       107         01101011    K       075         01001011
l       108         01101100    L       076         01001100
m       109         01101101    M       077         01001101
n       110         01101110    N       078         01001110
o       111         01101111    O       079         01001111
p       112         01110000    P       080         01010000
q       113         01110001    Q       081         01010001
r       114         01110010    R       082         01010010
s       115         01110011    S       083         01010011
t       116         01110100    T       084         01010100
u       117         01110101    U       085         01010101
v       118         01110110    V       086         01010110
w       119         01110111    W       087         01010111
x       120         01111000    X       088         01011000
y       121         01111001    Y       089         01011001
z       122         01111010    Z       090         01011010
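The entries of the ASCII table can be reproduced programmatically. The sketch below assumes a hypothetical helper named ascii_row; it simply formats a letter, its decimal code, and its 8-bit binary form.

```ruby
# Format one table entry: letter, zero-padded decimal code, 8-bit binary.
def ascii_row(letter)
  format("%s  %03d  %08b", letter, letter.ord, letter.ord)
end

# Pair every lowercase letter with its uppercase counterpart.
rows = ("a".."z").map do |lower|
  "#{ascii_row(lower)}    #{ascii_row(lower.upcase)}"
end

puts rows

# Lowercase and uppercase codes always differ by 32, i.e. by a single bit.
case_gap = "a".ord - "A".ord
puts case_gap  # 32
```

The constant gap of 32 explains why each lowercase/uppercase pair in the table differs only in the third bit from the left.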
UTF-8: It is a variable-width character encoding standard used for digital communication. The name comes from “Unicode Transformation Format - 8-bit”, and the encoding is defined by the Unicode Standard. UTF-8 can encode all 1,112,064 valid character code points, using one to four one-byte code units per character. A UTF-8 encoded file may begin with the three bytes 0xEF 0xBB 0xBF. These bytes are the BOM, or “Byte Order Mark”, which is optional in UTF-8 encoded files, so its absence does not mean a file is not UTF-8.
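The properties described above can be checked with a short Ruby sketch. The sample characters, the constant names, and the helper starts_with_bom? are chosen only for illustration. The \u escapes keep the example itself in plain ASCII while producing UTF-8 strings.

```ruby
# Sample characters that need 1, 2, 3, and 4 bytes in UTF-8.
SAMPLES = {
  "letter A"            => "\u0041",
  "e with acute accent" => "\u00E9",
  "euro sign"           => "\u20AC",
  "grinning face emoji" => "\u{1F600}"
}.freeze

byte_counts = SAMPLES.transform_values(&:bytesize)
byte_counts.each { |label, count| puts "#{label}: #{count} byte(s)" }

# The optional UTF-8 byte order mark, as raw bytes.
UTF8_BOM = "\xEF\xBB\xBF".b.freeze

# Hypothetical helper: does the given data begin with the UTF-8 BOM?
def starts_with_bom?(data)
  data.b.start_with?(UTF8_BOM)
end

puts starts_with_bom?("\uFEFFputs 1")  # true
puts starts_with_bom?("puts 1")        # false

# All code points up to U+10FFFF, minus the 2,048 surrogates.
valid_code_points = 0x110000 - (0xDFFF - 0xD800 + 1)
puts valid_code_points  # 1112064
```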
Specifying program encoding in Ruby
By default, older Ruby interpreters (before Ruby 2.0) assume that programs are encoded in ASCII. Since Ruby 2.0, the default source encoding is UTF-8.
You may specify a different encoding with the -K command-line option.
To use UTF-8 encoding in Ruby programs, use the -Ku option.
The -Ke and -Ks options select the EUC-JP and SJIS encodings, respectively.
From Ruby 1.9 onwards, the script’s author can specify the encoding by placing a special ‘coding comment’ at the beginning of the file.
For example: # coding: utf-8
With a coding comment in place, the user running the script does not have to specify the encoding with -K. The script's author declares it in the file itself.
The coding comment must be strictly written in ASCII and include the string coding. This must be followed by a colon or equal sign and the name of the desired encoding.
The coding comment is not case sensitive and can be written in upper or lower case.
Encoding comments are normally valid only on the first line of the file. Ruby scans the first line (or the second line, if the first is a shebang) for a comment that contains the string ‘coding’.
If the first line is a shebang comment, the encoding comment may appear on the second line.
Example:
#!/usr/bin/ruby -w
# coding: utf-8
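The rules for coding comments listed above can be tried out with a small experiment. The sketch below assumes a hypothetical helper, source_encoding_for, which writes the given header lines into a temporary script, loads it, and reports the source encoding Ruby chose for it. The global variable and the file name prefix are likewise made up for this demonstration.

```ruby
require "tempfile"

# Hypothetical helper: build a throwaway script from the header lines,
# load it, and return the source encoding Ruby assigned to that file.
def source_encoding_for(*header_lines)
  probe = "$probed_source_encoding = __ENCODING__\n"
  Tempfile.create(["coding_demo", ".rb"]) do |file|
    file.write(header_lines.join("\n") + "\n" + probe)
    file.flush
    load file.path
  end
  $probed_source_encoding
end

# Coding comment on the first line.
first_line = source_encoding_for("# coding: utf-8")

# Coding comment on the second line, after a shebang.
after_shebang = source_encoding_for("#!/usr/bin/ruby -w", "# coding: euc-jp")

# Upper case works too, since the comment is not case sensitive.
upper_case = source_encoding_for("# CODING: ISO-8859-1")

puts first_line     # UTF-8
puts after_shebang  # EUC-JP
puts upper_case     # ISO-8859-1
```

A temporary file is used because a coding comment only takes effect at the top of a source file, so it cannot be demonstrated in the middle of a running script.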
Ruby also supports source encodings such as ASCII-8BIT, US-ASCII, ISO-8859-1, SJIS and EUC-JP.
The language keyword __ENCODING__ evaluates to the source encoding of the currently executing code.
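A minimal sketch of how __ENCODING__ and the encodings named above can be inspected. The variable names are illustrative, and the printed source encoding depends on how the script is run, so no fixed value is shown for it.

```ruby
# The source encoding of this very file or snippet.
source_encoding = __ENCODING__
puts source_encoding

# A plain string literal takes on the source encoding.
plain_literal = "abc"
puts plain_literal.encoding == source_encoding  # true

# A literal containing a \u escape is always UTF-8.
escaped_literal = "\u00E9"
puts escaped_literal.encoding  # UTF-8

# Look up the encodings mentioned in the text by name.
listed_names = %w[ASCII-8BIT US-ASCII ISO-8859-1 SJIS EUC-JP UTF-8]
resolved = listed_names.map { |name| Encoding.find(name) }
resolved.each { |encoding| puts encoding.name }
```

Encoding.find accepts aliases, so the name printed for an encoding may differ from the name used to look it up.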
Frequently Asked Questions
What is the advantage of using Ruby?
It is secure, flexible, and open-source with native multi-threading support.
What is an interpreter?
Interpreters are computer programs that allow direct execution of code or script without the need for compilation first.
What is a shebang comment in Ruby?
This comment makes the Ruby script executable on Unix-like operating systems.
What is a scripting language?
It is a language for run-time systems that automates the execution of tasks.
How many characters are listed in ASCII encoding?
ASCII is a 7-bit character set that contains 128 characters.
Conclusion
In this blog, you learned about encoding in general, common encoding types, and how to specify the program encoding in Ruby. The knowledge gained here will help you write Ruby scripts that correctly handle text in various human languages.
After reading about these encodings in Ruby, are you excited to read more articles on Ruby? Coding Ninjas has you covered. To learn more, see Ruby_Coding_Ninjas and Ruby on Rails.
For more in-depth information, see the official Ruby documentation, FAQs, and koans.