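New Features in PHP6: Unicode and TextIterator
Jun 08, 2016, 05:32 PM
I just installed the PHP6 dev build and decided to test one of its new features: Unicode support. I don't intend to cover PHP6's new features or Unicode in general; what follows are simply the Unicode tests I ran.
The first step is to enable Unicode support in PHP6 by editing the php.ini file.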
;;;;;;;;;;;;;;;;;;;;
; Unicode settings ;
;;;;;;;;;;;;;;;;;;;;unicode.semantics = on
unicode.runtime_encoding = utf-8
unicode.script_encoding = utf-8
unicode.output_encoding = utf-8
unicode.from_error_mode = U_INVALID_SUBSTITUTE
unicode.from_error_subst_char = 3f
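Because I write in French, which differs from English, there are some accented characters that need handling.
So the goal of my first test was to check how strlen behaves with Unicode: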
$word = "être";
echo "Length: ".strlen($word);
The result is: Length: 4. That is exactly right... but it is only the beginning! :)
My second test subject was TextIterator, part of the new SPL in PHP6:
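For comparison, here is a minimal sketch of the same check on an interpreter without unicode.semantics. It assumes a stock PHP 5 or later build with the mbstring extension loaded, which is not part of the original test. The word is written with explicit UTF-8 byte escapes so the result does not depend on how the file is saved.

```php
<?php
// Assumption: a stock PHP build with mbstring; no unicode.semantics setting.
$word = "\xC3\xAAtre"; // "être" encoded as UTF-8 ("ê" takes two bytes)

// strlen() counts bytes here, so the accented letter counts twice.
$byteLength = strlen($word);

// mb_strlen() counts characters when told the encoding.
$charLength = mb_strlen($word, 'UTF-8');

echo "Bytes: " . $byteLength . "\n";      // Bytes: 5
echo "Characters: " . $charLength . "\n"; // Characters: 4
```

The value 4 reported by the PHP6 test matches the character count, not the byte count.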
$word = "être";
foreach (new TextIterator($word, TextIterator::CHARACTER) as $character) {
    var_inspect($character);
}
Output:
unicode(1) "ê" { 00ea }
unicode(1) "t" { 0074 }
unicode(1) "r" { 0072 }
unicode(1) "e" { 0065 }
Splitting the word yields each letter along with information about it...
TextIterator::CHARACTER looks very powerful, but TextIterator::WORD is even more so:
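The hex values in braces are the UTF-16 code units of each character. As a rough sketch of how the same values could be produced without TextIterator, the snippet below splits the word with preg_split and converts each character with mb_convert_encoding. It assumes mbstring and PCRE with UTF-8 support are available; this is a swapped-in technique for illustration, not how TextIterator works internally.

```php
<?php
// Assumption: mbstring and PCRE with UTF-8 support are available.
$word = "\xC3\xAAtre"; // "être" encoded as UTF-8

// An empty pattern with the /u modifier splits between code points.
$characters = preg_split('//u', $word, -1, PREG_SPLIT_NO_EMPTY);

$hex = [];
foreach ($characters as $character) {
    // Convert to UTF-16BE and hex-encode to get the code unit, e.g. 00ea.
    $hex[] = bin2hex(mb_convert_encoding($character, 'UTF-16BE', 'UTF-8'));
}

echo implode(' ', $hex), "\n"; // 00ea 0074 0072 0065
```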
$sentences = "Bonjour, nous sommes Français ! Aïe :)";
foreach (new TextIterator($sentences, TextIterator::WORD) as $word) {
    var_inspect($word);
}
The result:
unicode(7) "Bonjour" { 0042 006f 006e 006a 006f 0075 0072 }
unicode(1) "," { 002c }
unicode(1) " " { 0020 }
unicode(4) "nous" { 006e 006f 0075 0073 }
unicode(1) " " { 0020 }
unicode(6) "sommes" { 0073 006f 006d 006d 0065 0073 }
unicode(1) " " { 0020 }
unicode(8) "Français" { 0046 0072 0061 006e 00e7 0061 0069 0073 }
unicode(1) " " { 0020 }
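The output shows that word mode returns each word, punctuation mark, and space as a separate piece. A similar split can be approximated with a Unicode-aware regular expression. This is only a sketch under the assumption that PCRE is built with UTF-8 and Unicode property support, and it is not a reimplementation of TextIterator::WORD, whose exact boundary rules may differ.

```php
<?php
// Assumption: PCRE with UTF-8 and Unicode property support (the /u modifier).
$sentences = "Bonjour, nous sommes Fran\xC3\xA7ais ! A\xC3\xAFe :)";

// Match either a run of word characters or one single non-word character.
preg_match_all('/\w+|\W/u', $sentences, $matches);
$tokens = $matches[0];

foreach ($tokens as $token) {
    // Show each token with its character count, similar to unicode(N) above.
    echo mb_strlen($token, 'UTF-8') . ' "' . $token . '"' . "\n";
}
```

With the /u modifier, \w also matches accented letters, so "Français" stays in one piece instead of being cut at "ç".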

