README.EXT_SKEL
1(NOTE: you may also want to take a look at the pear package
2 PECL_Gen, a PHP-only alternative for this script that
3 supports way more extension writing tasks and is
4 supposed to replace ext_skel completely in the long run ...)
5
6WHAT IT IS
7
8 It's a tool for automatically creating the basic framework for a PHP module
9 and writing C code handling arguments passed to your functions from a simple
10 configuration file. See an example at the end of this file.
11
12HOW TO USE IT
13
14 Very simple. First, change to the ext/ directory of the PHP 4 sources. If
15 you just need the basic framework and will be writing all the code in your
16 functions yourself, you can now do
17
18 ./ext_skel --extname=module_name
19
20 and everything you need is placed in directory module_name.
21
22 [ Note that GNU awk is likely required for this script to work. Debian
23 systems seem to default to using mawk, so you may need to change the
24 #! line in skeleton/create_stubs and the cat $proto | awk line in
25 ext_skel to use gawk explicitly. ]
26
27 If you don't need to test the existence of any external header files,
28 libraries or functions in them, the module is already almost ready to be
29 compiled in PHP. Just remove 3 comments in your_module_name/config.m4,
30 change back up to PHP sources top directory, and do
31
32 ./buildconf; ./configure --enable-module_name; make
33
34 But if you already have planned the overall scheme of your module, what
35 functions it will contain, their return types and the arguments they take
36 (a very good idea) and don't want to bother yourself with creating function
37 definitions and handling arguments passed yourself, it's time to create a
38 function definitions file, which you will give as an argument to ext_skel
39 with option
40
41 --proto=filename.
42
43FORMAT OF FUNCTION DEFINITIONS FILE
44
  Each definition must be on a single line. In its simplest form, it is just
  the function name, e.g.
47
48 my_function
49
50 but then you'll be left with an almost empty function body without any
51 argument handling.
52
  Arguments are given in parentheses after the function name, in the form
  'argument_type argument_name'. Arguments are separated from each other by
  a comma and an optional space. argument_type can be one of int, bool,
  double, float, string, array, object or mixed.
57
  An optional argument is introduced by an optional space and then '[',
  followed by the usual comma and optional space. Close a run of optional
  arguments with as many ']'s as there were '['s. Currently no harm is done
  if you forget them or miscount the ']'s, but this may change in the future.
63
64 An additional short description may be added after the parameters.
65 If present it will be filled into the 'proto' header comments in the stubs
66 code and the <refpurpose> tag in the XML documentation.
67
68 An example:
69
70 my_function(int arg1, int arg2 [, int arg3 [, int arg4]]) this is my 1st
71
72 Arguments arg3 and arg4 are optional.
73
  If possible, the function definition should also contain its return type
  in front of the definition. It is not used for generating any C code, only
  for the PHP in-source documentation, where it is very useful. It can be
  any of int, double, string, bool, array, object, resource, mixed or void.
79
80 The file must contain nothing else but function definitions, no comments or
81 empty lines.
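
  To make the bracket notation concrete, here is a small stand-alone C
  sketch that counts the required and optional arguments in one definition
  line. It is not part of ext_skel (which does this work in awk); the name
  demo_count_proto_args is hypothetical and the logic only assumes the
  format described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of ext_skel: counts the required and
 * optional arguments in one definition line. Returns 0 on success and
 * -1 if the line has no parenthesised argument list. */
int demo_count_proto_args(const char *line, int *required, int *optional)
{
    const char *p = line;
    int depth = 0;   /* number of '[' seen so far */
    int in_arg = 0;  /* non-zero while inside a 'type name' pair */

    assert(line != NULL && required != NULL && optional != NULL);

    *required = 0;
    *optional = 0;

    /* Skip the optional return type and the function name. */
    while (*p != '\0' && *p != '(') {
        p++;
    }
    if (*p != '(') {
        return -1;
    }

    for (p++; *p != '\0' && *p != ')'; p++) {
        if (*p == '[') {
            depth++;
            in_arg = 0;
        } else if (*p == ',' || *p == ']') {
            in_arg = 0;
        } else if (*p != ' ' && *p != '\t' && !in_arg) {
            /* First character of a new argument. Everything after the
             * first '[' is optional. */
            in_arg = 1;
            if (depth > 0) {
                (*optional)++;
            } else {
                (*required)++;
            }
        }
    }
    return 0;
}
```

  For the my_function example above this reports two required and two
  optional arguments, which is the range the generated argument-count
  check has to accept.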
82
83OTHER OPTIONS
84
85 --no-help
86
87 By default, ext_skel creates both comments in the source code and a test
88 function to help first time module writers to get started and testing
89 configuring and compiling their module. This option turns off all such things
90 which may just annoy experienced PHP module coders. Especially useful with
91
92 --stubs=file
93
94 which will leave out also all module specific stuff and write just function
95 stubs with function value declarations and passed argument handling, and
96 function entries and definitions at the end of the file, for copying and
97 pasting into an already existing module.
98
99 --assign-params
100 --string-lens
101
102 By default, function proto 'void foo(string bar)' creates the following:
103 ...
104 zval **bar;
105 ... (zend_get_parameters_ex() called in the middle...)
106 convert_to_string_ex(bar);
107
108 Specifying both of these options changes the generated code to:
109 ...
110 zval **bar_arg;
111 int bar_len;
112 char *bar = NULL;
113 ... (zend_get_parameters_ex() called in the middle...)
114 convert_to_string_ex(bar_arg);
115 bar = Z_STRVAL_PP(bar_arg);
116 bar_len = Z_STRLEN_PP(bar_arg);
117
118 You shouldn't have to ask what happens if you leave --string-lens out. If you
119 have to, it's questionable whether you should be reading this document.
120
121 --with-xml[=file]
122
123 Creates the basics for phpdoc .xml file.
124
125 --full-xml
126
127 Not implemented yet. When or if there will ever be created a framework for
128 self-contained extensions to use phpdoc system for their documentation, this
129 option enables it on the created xml file.
130
131CURRENT LIMITATIONS, BUGS AND OTHER ODDITIES
132
133 Only arguments of types int, bool, double, float, string and array are
134 handled. For other types you must write the code yourself. And for type
135 mixed, it wouldn't even be possible to write anything, because only you
136 know what to expect.
137
  It cannot correctly handle variable argument lists, e.g.
  void foo(int bar [, ...]), and probably never will.
140
141 Don't trust the generated code too much. It tries to be useful in most of
142 the situations you might encounter, but automatic code generation will never
143 beat a programmer who knows the real situation at hand. ext_skel is generally
144 best suited for quickly generating a wrapper for c-library functions you
145 might want to have available in PHP too.
146
147 This program doesn't have a --help option. It has --no-help instead.
148
149EXAMPLE
150
151 The following _one_ line
152
153 bool my_drawtext(resource image, string text, resource font, int x, int y [, int color])
154
155 will create this function definition for you (note that there are a few
156 question marks to be replaced by you, and you must of course add your own
157 value definitions too):
158
159/* {{{ proto bool my_drawtext(resource image, string text, resource font, int x, int y[, int color])
160 */
161PHP_FUNCTION(my_drawtext)
162{
163 zval **image, **text, **font, **x, **y, **color;
164 int argc;
165 int image_id = -1;
166 int font_id = -1;
167
168 argc = ZEND_NUM_ARGS();
169 if (argc < 5 || argc > 6 || zend_get_parameters_ex(argc, &image, &text, &font, &x, &y, &color) == FAILURE) {
170 WRONG_PARAM_COUNT;
171 }
172
173 ZEND_FETCH_RESOURCE(???, ???, image, image_id, "???", ???_rsrc_id);
174 ZEND_FETCH_RESOURCE(???, ???, font, font_id, "???", ???_rsrc_id);
175
176 switch (argc) {
177 case 6:
178 convert_to_long_ex(color);
179 /* Fall-through. */
180 case 5:
181 convert_to_long_ex(y);
182 convert_to_long_ex(x);
183 /* font: fetching resources already handled. */
184 convert_to_string_ex(text);
185 /* image: fetching resources already handled. */
186 break;
187 default:
188 WRONG_PARAM_COUNT;
189 }
190
191 php_error(E_WARNING, "my_drawtext: not yet implemented");
192}
193/* }}} */
194
195
README.EXTENSIONS
1Between PHP 4.0.6 and 4.1.0, the Zend module struct changed in a way
2that broke both source and binary compatibility. If you are
3maintaining a third party extension, here's how to update it:
4
5If this was your old module entry:
6
7zend_module_entry foo_module_entry = {
8 "foo", /* extension name */
9 foo_functions, /* extension function list */
10 NULL, /* extension-wide startup function */
11 NULL, /* extension-wide shutdown function */
12 PHP_RINIT(foo), /* per-request startup function */
13 PHP_RSHUTDOWN(foo), /* per-request shutdown function */
14 PHP_MINFO(foo), /* information function */
15 STANDARD_MODULE_PROPERTIES
16};
17
Here's how it should look if you want your code to build with PHP
4.1.0 and up:
20
21zend_module_entry foo_module_entry = {
22#if ZEND_MODULE_API_NO >= 20010901
23 STANDARD_MODULE_HEADER,
24#endif
25 "foo", /* extension name */
26 foo_functions, /* extension function list */
27 NULL, /* extension-wide startup function */
28 NULL, /* extension-wide shutdown function */
29 PHP_RINIT(foo), /* per-request startup function */
30 PHP_RSHUTDOWN(foo), /* per-request shutdown function */
31 PHP_MINFO(foo), /* information function */
32#if ZEND_MODULE_API_NO >= 20010901
33 FOO_VERSION, /* extension version number (string) */
34#endif
35 STANDARD_MODULE_PROPERTIES
36};
37
If you don't care about source compatibility with PHP releases earlier
than 4.1.0, you can drop the #if/#endif lines.
40
README.input_filter
1Input Filter Support in PHP 5
2-----------------------------
3
4XSS (Cross Site Scripting) hacks are becoming more and more prevalent,
5and can be quite difficult to prevent. Whenever you accept user data
6and somehow display this data back to users, you are likely vulnerable
7to XSS hacks.
8
9The Input Filter support in PHP 5 is aimed at providing the framework
10through which a company-wide or site-wide security policy can be
11enforced. It is implemented as a SAPI hook and is called from the
12treat_data and post handler functions. To implement your own security
13policy you will need to write a standard PHP extension.
14
15A simple implementation might look like the following. This stores the
16original raw user data and adds a my_get_raw() function while the normal
17$_POST, $_GET and $_COOKIE arrays are only populated with stripped
18data. In this simple example all I am doing is calling strip_tags() on
19the data. If register_globals is turned on, the default globals that
20are created will be stripped ($foo) while a $RAW_foo is created with the
21original user input.
22
23ZEND_BEGIN_MODULE_GLOBALS(my_input_filter)
24 zval *post_array;
25 zval *get_array;
26 zval *cookie_array;
27ZEND_END_MODULE_GLOBALS(my_input_filter)
28
29#ifdef ZTS
30#define IF_G(v) TSRMG(my_input_filter_globals_id, zend_my_input_filter_globals *, v)
31#else
32#define IF_G(v) (my_input_filter_globals.v)
33#endif
34
35ZEND_DECLARE_MODULE_GLOBALS(my_input_filter)
36
37zend_function_entry my_input_filter_functions[] = {
38 PHP_FE(my_get_raw, NULL)
39 {NULL, NULL, NULL}
40};
41
42zend_module_entry my_input_filter_module_entry = {
43 STANDARD_MODULE_HEADER,
44 "my_input_filter",
45 my_input_filter_functions,
46 PHP_MINIT(my_input_filter),
47 PHP_MSHUTDOWN(my_input_filter),
48 NULL,
49 PHP_RSHUTDOWN(my_input_filter),
50 PHP_MINFO(my_input_filter),
51 "0.1",
52 STANDARD_MODULE_PROPERTIES
53};
54
55PHP_MINIT_FUNCTION(my_input_filter)
56{
57 ZEND_INIT_MODULE_GLOBALS(my_input_filter, php_my_input_filter_init_globals, NULL);
58
59 REGISTER_LONG_CONSTANT("POST", PARSE_POST, CONST_CS | CONST_PERSISTENT);
60 REGISTER_LONG_CONSTANT("GET", PARSE_GET, CONST_CS | CONST_PERSISTENT);
61 REGISTER_LONG_CONSTANT("COOKIE", PARSE_COOKIE, CONST_CS | CONST_PERSISTENT);
62
63 sapi_register_input_filter(my_sapi_input_filter);
64 return SUCCESS;
65}
66
67PHP_RSHUTDOWN_FUNCTION(my_input_filter)
68{
69 if(IF_G(get_array)) {
70 zval_ptr_dtor(&IF_G(get_array));
71 IF_G(get_array) = NULL;
72 }
73 if(IF_G(post_array)) {
74 zval_ptr_dtor(&IF_G(post_array));
75 IF_G(post_array) = NULL;
76 }
77 if(IF_G(cookie_array)) {
78 zval_ptr_dtor(&IF_G(cookie_array));
79 IF_G(cookie_array) = NULL;
80 }
81 return SUCCESS;
82}
83
84PHP_MINFO_FUNCTION(my_input_filter)
85{
86 php_info_print_table_start();
87 php_info_print_table_row( 2, "My Input Filter Support", "enabled" );
88 php_info_print_table_row( 2, "Revision", "$Revision$");
89 php_info_print_table_end();
90}
91
/* The filter handler. If you return 1 from it, then PHP also registers the
 * (modified) variable. Returning 0 prevents PHP from registering the variable;
 * you can use this if your filter already registers the variable under a
 * different name, or if you just don't want the variable registered at all. */
SAPI_INPUT_FILTER_FUNC(my_sapi_input_filter)
{
    zval new_var;
    zval *array_ptr = NULL;
    char *raw_var;
    int var_len;

    assert(*val != NULL);

    /* Create the per-source array on first use and reuse it afterwards. */
    switch(arg) {
        case PARSE_GET:
            if(!IF_G(get_array)) {
                ALLOC_ZVAL(array_ptr);
                array_init(array_ptr);
                INIT_PZVAL(array_ptr);
                IF_G(get_array) = array_ptr;
            }
            array_ptr = IF_G(get_array);
            break;
        case PARSE_POST:
            if(!IF_G(post_array)) {
                ALLOC_ZVAL(array_ptr);
                array_init(array_ptr);
                INIT_PZVAL(array_ptr);
                IF_G(post_array) = array_ptr;
            }
            array_ptr = IF_G(post_array);
            break;
        case PARSE_COOKIE:
            if(!IF_G(cookie_array)) {
                ALLOC_ZVAL(array_ptr);
                array_init(array_ptr);
                INIT_PZVAL(array_ptr);
                IF_G(cookie_array) = array_ptr;
            }
            array_ptr = IF_G(cookie_array);
            break;
    }

    /* Keep an untouched copy of the raw value. */
    Z_STRLEN(new_var) = val_len;
    Z_STRVAL(new_var) = estrndup(*val, val_len);
    Z_TYPE(new_var) = IS_STRING;

    var_len = strlen(var);
    raw_var = emalloc(var_len+5);  /* RAW_ and a \0 */
    strcpy(raw_var, "RAW_");
    strlcat(raw_var,var,var_len+5);

    php_register_variable_ex(raw_var, &new_var, array_ptr TSRMLS_CC);

    /* Strip tags in place; this is the value PHP goes on to register. */
    php_strip_tags(*val, val_len, NULL, NULL, 0);

    *new_val_len = strlen(*val);
    return 1;
}
147
PHP_FUNCTION(my_get_raw)
{
    long arg;
    char *var;
    int var_len;
    zval **tmp;
    zval *array_ptr = NULL;
    HashTable *hash_ptr;
    char *raw_var;

    if(zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "ls", &arg, &var, &var_len) == FAILURE) {
        return;
    }

    switch(arg) {
        case PARSE_GET:
            array_ptr = IF_G(get_array);
            break;
        case PARSE_POST:
            array_ptr = IF_G(post_array);
            break;
        case PARSE_COOKIE:
            array_ptr = IF_G(cookie_array);
            break;
    }

    if(!array_ptr) RETURN_FALSE;

    /*
     * I'm changing the variable name here because when running with register_globals on,
     * the variable will end up in the global symbol table
     */
    raw_var = emalloc(var_len+5);  /* RAW_ and a \0 */
    strcpy(raw_var, "RAW_");
    strlcat(raw_var,var,var_len+5);
    hash_ptr = HASH_OF(array_ptr);

    /* The key length passed to zend_hash_find() includes the trailing \0. */
    if(zend_hash_find(hash_ptr, raw_var, var_len+5, (void **)&tmp) == SUCCESS) {
        *return_value = **tmp;
        zval_copy_ctor(return_value);
    } else {
        RETVAL_FALSE;
    }
    efree(raw_var);
}
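
Both functions above build the "RAW_" key by hand. The following stand-alone
sketch isolates that step so the buffer arithmetic (four prefix bytes, the
name, one terminating \0) is easy to check. It uses plain malloc() and
snprintf() instead of emalloc() and strlcat() so that it compiles outside
PHP; demo_make_raw_key is a hypothetical helper, not part of any PHP API.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc()ed "RAW_<var>" string and stores
 * the key size, including the terminating \0, in *key_size. The caller
 * frees the result. Returns NULL if the allocation fails. */
char *demo_make_raw_key(const char *var, size_t *key_size)
{
    size_t size;
    char *key;

    assert(var != NULL && key_size != NULL);

    /* strlen("RAW_") is 4, so this equals var_len + 5 in the code above. */
    size = strlen("RAW_") + strlen(var) + 1;
    key = malloc(size);
    if (key == NULL) {
        return NULL;
    }
    snprintf(key, size, "RAW_%s", var);
    *key_size = size;
    return key;
}
```

For a variable named foo this yields the key RAW_foo with a size of 8, the
same value my_get_raw() passes to zend_hash_find() as var_len+5.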
193
194
README.MAILINGLIST_RULES
1====================
2 Mailinglist Rules
3====================
4
5This is the first file you should be reading before doing any posts on PHP
6mailinglists. Following these rules is considered imperative to the success of
7the PHP project. Therefore expect your contributions to be of much less positive
8impact if you do not follow these rules. More importantly you can actually
9assume that not following these rules will hurt the PHP project.
10
PHP is developed through the efforts of a large number of people.
Collaboration is a Good Thing(tm), and mailing lists let us do this. Thus,
following some basic rules with regard to mailing list usage will:
14
15 a. Make everybody happier, especially those responsible for developing PHP
16 itself.
17
18 b. Help in making sure we all use our time more efficiently.
19
20 c. Prevent you from making a fool of yourself in public.
21
22 d. Increase the general level of good will on planet Earth.
23
24
25Having said that, here are the organizational rules:
26
27 1. Respect other people working on the project.
28
  2. Do not post when you are angry. Any post can wait a few hours. Review
     your post after a good breather or a good night's sleep.
31
32 3. Make sure you pick the right mailinglist for your posting. Please review
33 the descriptions on the mailinglist overview page
34 (http://www.php.net/mailing-lists.php). When in doubt ask a friend or
35 someone you trust on IRC.
36
37 4. Make sure you know what you are talking about. PHP is a very large project
38 that strives to be very open. The flip side is that the core developers
39 are faced with a lot of requests. Make sure that you have done your
40 research before posting to the entire developer community.
41
42 5. Patches have a much greater chance of acceptance than just asking the
43 PHP developers to implement a feature for you. For one it makes the
44 discussion more concrete and it shows that the poster put thought and time
45 into the request.
46
47 6. If you are posting to an existing thread, make sure that you know what
48 previous posters have said. This is even more important the longer the
49 thread is already.
50
51 7. Please configure your email client to use a real name and keep message
52 signatures to a maximum of 2 lines if at all necessary.
53
The next few rules are more general hints:
55
56 1. If you notice that your posting ratio is much higher than that of other
57 people, double check the above rules. Try to wait a bit longer before
58 sending your replies to give other people more time to digest your answers
59 and more importantly give you the opportunity to make sure that you
60 aggregate your current position into a single mail instead of multiple
61 ones.
62
63 2. Consider taking a step back from a very active thread now and then. Maybe
64 talking to some friends and fellow developers will help in understanding
65 the other opinions better.
66
  3. Do not top post. Place your answer underneath the text you wish to
     quote and remove any previous comment that is not relevant to your post.

  4. Do not hijack threads by bringing up entirely new topics. Please
     create an entirely new thread, copying anything you wish to quote into
     the new thread.
73
74Finally, additional hints on how to behave inside the virtual community can be
75found in RFC 1855 (http://www.faqs.org/rfcs/rfc1855.html).
76
77Happy hacking,
78
79PHP Team
80
README.namespaces
1Design
2======
3
The main assumption of the model is that the problem to solve is that of
very long class names in PHP libraries. We do not attempt to take over the
autoloader's job or to create a packaging model; we only make names
manageable.
8
9Namespaces are defined the following way:
10
11Zend/DB/Connection.php:
12<?php
13namespace Zend\DB;
14
15class Connection {
16}
17
18function connect() {
19}
20?>
21
A namespace definition does the following:
All class and function names inside it are automatically prefixed with the
namespace name. Inside a namespace, a local name always takes precedence
over a global name. Several files may use the same namespace.
The namespace declaration statement must be the very first statement in
the file. The only exception is the "declare" statement, which may precede it.
28
29Every class and function in a namespace can be referred to by the full name
30- e.g. Zend\DB\Connection or Zend\DB\connect - at any time.
31
<?php
require 'Zend/DB/Connection.php';
$x = new Zend\DB\Connection;
Zend\DB\connect();
?>
37
A namespace or class name can be imported:

<?php
require 'Zend/DB/Connection.php';
use Zend\DB;
use Zend\DB\Connection as DbConnection;

$x = new Zend\DB\Connection();
$y = new DB\Connection();
$z = new DbConnection();
DB\connect();
?>
50
The use statement only defines name aliasing. It may create an alias for a
namespace or for a class. The simple form "use A\B\C\D;" is equivalent to
"use A\B\C\D as D;". The use statement can be used anywhere in the global
scope (not inside a function or class) and takes effect from the point of
definition down to the end of the file. It is recommended, however, to
place use statements at the beginning of the file. A use statement affects
only the file in which it appears.
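
The aliasing performed by "use" is a plain textual substitution on the first
segment of a name. The stand-alone C sketch below models just that
substitution; it is an illustration, not the engine's code, and the name
demo_expand_alias is hypothetical. It compares segments case-sensitively for
brevity and handles a single import.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model of "use <import> [as <alias>]" applied to one name.
 * Pass alias == NULL for the short form of the statement.
 * Returns 1 if the name was expanded, 0 if it was copied unchanged and
 * -1 if the output buffer is too small. */
int demo_expand_alias(const char *import, const char *alias,
                      const char *name, char *out, size_t out_size)
{
    const char *last_sep;
    const char *rest;
    size_t alias_len, first_len;
    int written;

    assert(import != NULL && name != NULL && out != NULL);

    if (alias == NULL) {
        /* "use A\B\C\D;" is equivalent to "use A\B\C\D as D;". */
        last_sep = strrchr(import, '\\');
        alias = (last_sep != NULL) ? last_sep + 1 : import;
    }
    alias_len = strlen(alias);

    /* The first segment of the name ends at the first backslash. A name
     * with a leading backslash has an empty first segment and is therefore
     * never expanded. */
    rest = strchr(name, '\\');
    first_len = (rest != NULL) ? (size_t)(rest - name) : strlen(name);

    if (alias_len > 0 && first_len == alias_len &&
        strncmp(name, alias, alias_len) == 0) {
        written = snprintf(out, out_size, "%s%s", import, name + first_len);
        if (written < 0 || (size_t)written >= out_size) {
            return -1;
        }
        return 1;
    }

    written = snprintf(out, out_size, "%s", name);
    if (written < 0 || (size_t)written >= out_size) {
        return -1;
    }
    return 0;
}
```

With the import Zend\DB, the name DB\connect expands to Zend\DB\connect,
while \mysql_connect is left alone because it is already explicitly global.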
58
The special "empty" namespace (\ prefix) is useful as an explicit global
namespace qualification. All class and function names starting with \ are
interpreted as global.
62
63<?php
64namespace A\B\C;
65
66$con = \mysql_connect(...);
67?>
68
69A special constant __NAMESPACE__ contains the name of the current namespace.
70It can be used to construct fully-qualified names to pass them as callbacks.
71
<?php
namespace A\B\C;

function foo() {
}

set_error_handler(__NAMESPACE__ . "\\foo");
?>

In the global namespace, the __NAMESPACE__ constant is an empty string.
82
83Names inside namespace are resolved according to the following rules:
84
1) All qualified names are translated during compilation according to the
current import rules. So if we have "use A\B\C" and then "C\D\e()",
it is translated to "A\B\C\D\e()".
2) Unqualified class names are translated during compilation according to
the current import rules. So if we have "use A\B\C" and then "new C()", it
is translated to "new A\B\C()".
3) Inside a namespace, calls to unqualified functions that are defined in
the current namespace (and are known at the time the call is parsed) are
interpreted as calls to these namespace functions.
4) Inside a namespace, calls to unqualified functions that are not defined
in the current namespace are resolved at run-time. A call to function foo()
inside namespace A\B first tries to find and call function A\B\foo() from
the current namespace; if it doesn't exist, PHP tries to call the internal
function foo(). Note that with foo() inside a namespace you can reach only
functions of that namespace and internal PHP functions, whereas with \foo()
you can call any function from the global namespace.
5) Unqualified class names are resolved at run-time. E.g. "new Exception()"
first tries to use (and autoload) the class from the current namespace and,
on failure, uses the internal PHP class. Note that with "new A" in a
namespace you can only create a class from this namespace or an internal
PHP class, whereas with "new \A" you can create any class from the global
namespace.
6) Calls to qualified functions are resolved at run-time. A call to
A\B\foo() first tries to call function foo() from namespace A\B, then
tries to find class A\B (calling __autoload() if necessary) and call its
static method foo().
7) Qualified class names are interpreted as classes from the corresponding
namespace. So "new A\B\C()" refers to class C from namespace A\B.
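
The run-time fallback in rule 4 can be modelled in a few lines of stand-alone
C. This is only a sketch of the lookup order, not the engine's
implementation: the function table, the struct and all demo_ names are
hypothetical, and only unqualified names and names with a leading backslash
are handled.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical function table entry. */
struct demo_function {
    const char *name;   /* fully qualified, without a leading backslash */
    int is_internal;    /* 1 for built-in functions, 0 for user functions */
};

static const struct demo_function *demo_find(const struct demo_function *table,
                                             size_t count, const char *name)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (strcmp(table[i].name, name) == 0) {
            return &table[i];
        }
    }
    return NULL;
}

/* Resolves a call made inside namespace ns. Returns the matching entry or
 * NULL if the call cannot be resolved. */
const struct demo_function *demo_resolve_call(const struct demo_function *table,
                                              size_t count, const char *ns,
                                              const char *name)
{
    char qualified[256];
    const struct demo_function *f;
    int written;

    assert(table != NULL && ns != NULL && name != NULL);

    if (name[0] == '\\') {
        /* \foo(): any function from the global namespace. */
        return demo_find(table, count, name + 1);
    }

    /* First try the function from the current namespace ... */
    written = snprintf(qualified, sizeof(qualified), "%s\\%s", ns, name);
    if (written > 0 && (size_t)written < sizeof(qualified)) {
        f = demo_find(table, count, qualified);
        if (f != NULL) {
            return f;
        }
    }

    /* ... then fall back to internal functions only. */
    f = demo_find(table, count, name);
    return (f != NULL && f->is_internal) ? f : NULL;
}
```

In this model a user-defined global function is not found by an unqualified
call from inside a namespace, but it is found once the call is written with
a leading backslash.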
112
113Examples
114--------
115<?php
116namespace A;
117foo(); // first tries to call "foo" defined in namespace "A"
118 // then calls internal function "foo"
119\foo(); // calls function "foo" defined in global scope
120?>
121
122<?php
123namespace A;
124new B(); // first tries to create object of class "B" defined in namespace "A"
125 // then creates object of internal class "B"
126new \B(); // creates object of class "B" defined in global scope
127?>
128
129<?php
130namespace A;
131new A(); // first tries to create object of class "A" from namespace "A" (A\A)
132 // then creates object of internal class "A"
133?>
134
135<?php
136namespace A;
137B\foo(); // first tries to call function "foo" from namespace "A\B"
138 // then calls method "foo" of internal class "B"
139\B\foo(); // first tries to call function "foo" from namespace "B"
140 // then calls method "foo" of class "B" from global scope
141?>
142
The worst case is when a class name conflicts with a namespace name:
144<?php
145namespace A;
146A\foo(); // first tries to call function "foo" from namespace "A\A"
147 // then tries to call method "foo" of class "A" from namespace "A"
148 // then tries to call function "foo" from namespace "A"
149 // then calls method "foo" of internal class "A"
150\A\foo(); // first tries to call function "foo" from namespace "A"
151 // then calls method "foo" of class "A" from global scope
152?>
153
154TODO
155====
156
157* Support for namespace constants?
158
159* performance problems
160 - calls to internal functions in namespaces are slower, because PHP first
161 looks for such function in current namespace
162 - calls to static methods are slower, because PHP first tries to look
163 for corresponding function in namespace
164
165* Extend the Reflection API?
166 * Add ReflectionNamespace class
167 + getName()
168 + getClasses()
169 + getFunctions()
170 + getFiles()
171 * Add getNamespace() methods to ReflectionClass and ReflectionFunction
172
173* Rename namespaces to packages?
174
175
README.NEW-OUTPUT-API
1$Id$
2
3
4API adjustment to the old output control code:
5
6 Everything now resides beneath the php_output namespace,
7 and there's an API call for every output handler op.
8
9 Checking output control layers status:
10 // Using OG()
11 php_output_get_status(TSRMLS_C);
12
13 Starting the default output handler:
14 // php_start_ob_buffer(NULL, 0, 1 TSRMLS_CC);
15 php_output_start_default(TSRMLS_C);
16
    Starting a user handler by zval:
        // php_start_ob_buffer(zhandler, chunk_size, erase TSRMLS_CC);
        php_output_start_user(zhandler, chunk_size, flags TSRMLS_CC);

    Starting an internal handler without context:
        // php_ob_set_internal_handler(my_php_output_handler_func_t, buffer_size, "output handler name", erase TSRMLS_CC);
        php_output_start_internal(handler_name_zval, my_php_output_handler_func_t, chunk_size, flags TSRMLS_CC);
24
25 Starting an internal handler with context:
26 // not possible with old API
27 php_output_handler *h;
28 h = php_output_handler_create_internal(handler_name_zval, my_php_output_handler_context_func_t, chunk_size, flags TSRMLS_CC);
29 php_output_handler_set_context(h, my_context, my_context_dtor);
30 php_output_handler_start(h TSRMLS_CC);
31
32 Testing whether a certain output handler has already been started:
33 // php_ob_handler_used("output handler name" TSRMLS_CC);
34 php_output_handler_started(handler_name_zval TSRMLS_CC);
35
36 Flushing one output buffer:
37 // php_ob_end_buffer(1, 1 TSRMLS_CC);
38 php_output_flush(TSRMLS_C);
39
40 Flushing all output buffers:
41 // not possible with old API
42 php_output_flush_all(TSRMLS_C);
43
44 Cleaning one output buffer:
45 // php_ob_end_buffer(0, 1 TSRMLS_CC);
46 php_output_clean(TSRMLS_C);
47
48 Cleaning all output buffers:
49 // not possible with old API
50 php_output_clean_all(TSRMLS_C);
51
52 Discarding one output buffer:
53 // php_ob_end_buffer(0, 0 TSRMLS_CC);
54 php_output_discard(TSRMLS_C);
55
56 Discarding all output buffers:
57 // php_ob_end_buffers(0 TSRMLS_CC);
58 php_output_discard_all(TSRMLS_C);
59
60 Stopping (and dropping) one output buffer:
61 // php_ob_end_buffer(1, 0 TSRMLS_CC)
62 php_output_end(TSRMLS_C);
63
64 Stopping (and dropping) all output buffers:
65 // php_ob_end_buffers(1, 0 TSRMLS_CC);
66 php_output_end_all(TSRMLS_C);
67
68 Retrieving output buffers contents:
69 // php_ob_get_buffer(zstring TSRMLS_CC);
70 php_output_get_contents(zstring TSRMLS_CC);
71
72 Retrieving output buffers length:
73 // php_ob_get_length(zlength TSRMLS_CC);
74 php_output_get_length(zlength TSRMLS_CC);
75
76 Retrieving output buffering level:
77 // OG(nesting_level);
78 php_output_get_level(TSRMLS_C);
79
80 Issue a warning because of an output handler conflict:
81 // php_ob_init_conflict("to be started handler name", "to be tested if already started handler name" TSRMLS_CC);
82 php_output_handler_conflict(new_handler_name_zval, set_handler_name_zval TSRMLS_CC);
83
    Registering a conflict checking function, which will be checked prior to starting the handler:
        // not possible with old API, unless hardcoding into output.c
        php_output_handler_conflict_register(handler_name_zval, my_php_output_handler_conflict_check_t TSRMLS_CC);

    Registering a reverse conflict checking function, which will be checked prior to starting the specified foreign handler:
        // not possible with old API
        php_output_handler_reverse_conflict_register(foreign_handler_name_zval, my_php_output_handler_conflict_check_t TSRMLS_CC);
91
92 Facilitating a context from within an output handler callable with ob_start():
93 // not possible with old API
94 php_output_handler_hook(PHP_OUTPUT_HANDLER_HOOK_GET_OPAQ, (void *) &custom_ctx_ptr_ptr TSRMLS_CC);
95
96 Disabling of the output handler by itself:
97 //not possible with old API
98 php_output_handler_hook(PHP_OUTPUT_HANDLER_HOOK_DISABLE, NULL TSRMLS_CC);
99
100 Marking an output handler immutable by itself because of irreversibility of its operation:
101 // not possible with old API
102 php_output_handler_hook(PHP_OUTPUT_HANDLER_HOOK_IMMUTABLE, NULL TSRMLS_CC);
103
104 Restarting the output handler because of a CLEAN operation:
105 // not possible with old API
106 if (flags & PHP_OUTPUT_HANDLER_CLEAN) { ... }
107
108 Recognizing by the output handler itself if it gets discarded:
109 // not possible with old API
110 if ((flags & PHP_OUTPUT_HANDLER_CLEAN) && (flags & PHP_OUTPUT_HANDLER_FINAL)) { ... }
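
    The two flag tests above overlap, because a discard sets the CLEAN bit
    as well. The stand-alone sketch below shows one way a handler might
    order the checks. The DEMO_ constants are local stand-ins chosen for
    illustration; a real handler would use the PHP_OUTPUT_HANDLER_*
    constants from the PHP headers.

```c
#include <assert.h>

/* Stand-in bit values for illustration only; real code takes
 * PHP_OUTPUT_HANDLER_CLEAN and PHP_OUTPUT_HANDLER_FINAL from the headers. */
#define DEMO_HANDLER_CLEAN 0x02
#define DEMO_HANDLER_FINAL 0x08

enum demo_op {
    DEMO_OP_OTHER,    /* neither a restart nor a discard */
    DEMO_OP_RESTART,  /* CLEAN without FINAL: the buffer was cleaned */
    DEMO_OP_DISCARD   /* CLEAN and FINAL: the handler is being discarded */
};

/* Hypothetical helper: classifies the flags passed to an output handler. */
enum demo_op demo_classify(int flags)
{
    assert(flags >= 0);

    /* Test the more specific combination first, since CLEAN is set in
     * both cases. */
    if ((flags & DEMO_HANDLER_CLEAN) && (flags & DEMO_HANDLER_FINAL)) {
        return DEMO_OP_DISCARD;
    }
    if (flags & DEMO_HANDLER_CLEAN) {
        return DEMO_OP_RESTART;
    }
    return DEMO_OP_OTHER;
}
```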
111
112
113Output handler hooks
114
    The output handler can change its abilities at runtime. E.g. the gz handler can
    remove the CLEANABLE and REMOVABLE bits once the first output has passed through it;
    or handlers implemented in C to be used with ob_start() can contain a non-global
    context:
        PHP_OUTPUT_HANDLER_HOOK_GET_OPAQ
            pass a void*** pointer as second arg to receive the address of a pointer
            pointer to the opaque field of the output handler context
        PHP_OUTPUT_HANDLER_HOOK_GET_FLAGS
            pass an int* pointer as second arg to receive the flags set for the output handler
        PHP_OUTPUT_HANDLER_HOOK_GET_LEVEL
            pass an int* pointer as second arg to receive the level of this output handler
            (starts with 0)
        PHP_OUTPUT_HANDLER_HOOK_IMMUTABLE
            the second arg is ignored; marks the output handler to be neither cleanable
            nor removable
        PHP_OUTPUT_HANDLER_HOOK_DISABLE
            the second arg is ignored; marks the output handler as disabled
132
133
134Open questions
135
136 Should the userland API be adjusted and unified?
137
138 Many bits of the manual (and very first implementation) do not comply
139 with the behaviour of the current (to be obsoleted) code, thus should
140 the manual or the behaviour be adjusted?
141
142END
143
README.PARAMETER_PARSING_API
1New parameter parsing functions
2===============================
3
It should be easier to parse input parameters to an extension function.
Hence, borrowing from Python's example, there is now a set of functions
that, given a string of type specifiers, can parse the input parameters
and store the results in user-specified variables. This avoids most
of the IS_* checks and convert_to_* conversions. The functions also
check for the appropriate number of parameters, and try to output
meaningful error messages.
11
12
13Prototypes
14----------
15/* Implemented. */
16int zend_parse_parameters(int num_args TSRMLS_DC, char *type_spec, ...);
17int zend_parse_parameters_ex(int flags, int num_args TSRMLS_DC, char *type_spec, ...);
18
The zend_parse_parameters() function takes the number of parameters
passed to the extension function, the type specifier string, and the
list of pointers to variables to store the results in. The _ex() version
also takes a 'flags' argument. Currently only ZEND_PARSE_PARAMS_QUIET can
be used as 'flags', to specify that the function should operate quietly
and not output any error messages.
25
26Both functions return SUCCESS or FAILURE depending on the result.
27
28The auto-conversions are performed as necessary. Arrays, objects, and
29resources cannot be auto-converted.
30
31
32Type specifiers
33---------------
 The following list shows each type specifier, its meaning and the parameter
 types that need to be passed by address. A passed variable is always set if
 the PHP parameter is required; it is left untouched if the PHP parameter is
 optional and not present. The only exception is O, where the
 zend_class_entry* has to be provided on input and is used to verify that
 the PHP parameter is an instance of that class.
40
41 a - array (zval*)
42 A - array or object (zval *)
43 b - boolean (zend_bool)
44 C - class (zend_class_entry*)
45 d - double (double)
46 f - function or array containing php method call info (returned as
47 zend_fcall_info and zend_fcall_info_cache)
48 h - array (returned as HashTable*)
49 H - array or HASH_OF(object) (returned as HashTable*)
50 l - long (long)
51 L - long, limits out-of-range numbers to LONG_MAX/LONG_MIN (long)
52 o - object of any type (zval*)
53 O - object of specific type given by class entry (zval*, zend_class_entry)
54 r - resource (zval*)
55 s - string (with possible null bytes) and its length (char*, int)
56 S - binary string, does not allow conversion from Unicode strings
57 t - text (zstr (string union), int (length), zend_uchar (IS_STRING/..))
58 accepts either Unicode or binary string
59 T - text (zstr (string union), int (length), zend_uchar (IS_STRING/..))
60 coalesces all T parameters to common type (Unicode or binary)
61 u - unicode (UChar*, int)
62 U - Unicode string, does not allow conversion from binary strings
63 x - Unicode or binary string depending on UG(unicode). In unicode this
64 behaves like 'u' and in nonunicode mode it behaves like 's'.
65 z - the actual zval (zval*)
66 Z - the actual zval (zval**)
67 * - variable arguments list (0 or more)
68 + - variable arguments list (1 or more)
69
70 The following characters also have a meaning in the specifier string:
71 | - indicates that the remaining parameters are optional, they
72 should be initialized to default values by the extension since they
73 will not be touched by the parsing function if they are not
74 passed to it.
75 / - use SEPARATE_ZVAL_IF_NOT_REF() on the parameter it follows
76 ! - the parameter it follows can be of specified type or NULL (applies
77 to all specifiers except for 'b', 'l', and 'd'). If NULL is passed, the
78 results pointer is set to NULL as well.
79 & - alternate format (currently used for 's' only to specify a converter to
80 use when converting from Unicode strings)
81 ^ - returns original string type before conversion (only for 's' and 'u'
82 specifiers)
83
84
85Note on 64bit compatibility
86---------------------------
Please do not forget that int and long are two different things on 64bit
OSes that use the LP64 model (int is 4 bytes and long is 8 bytes), so make
sure you pass longs to "l" and ints for string lengths (i.e. for "s" you need
to pass char * and int), not the other way round!
Remember: "l" (and its variant "L") is the only case where you need to pass a
long (and that's why it's "l", not "i" btw).
93
Both mistakes cause memory corruption and segfaults on 64bit OSes:
1)
  char *str;
  long str_len; /* XXX THIS IS WRONG!! Use int instead. */
  zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s", &str, &str_len);

2)
  int num; /* XXX THIS IS WRONG!! Use long instead. */
  zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "l", &num);
103
If you're in doubt, use the check_parameters.php script to check the
parameters and their types (it can be found in the ./scripts/dev/ directory
of the PHP sources):
106
107# php ./scripts/dev/check_parameters.php /path/to/your/sources/
108
109
110Examples
111--------
/* Gets a long, a string and its length, and a zval */
long l;
char *s;
int s_len;
zval *param;
if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "lsz",
                          &l, &s, &s_len, &param) == FAILURE) {
    return;
}
121
122
123/* Gets an object of class specified by my_ce, and an optional double. */
124zval *obj;
125double d = 0.5;
126zend_class_entry *my_ce;
127if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "O|d",
128 &obj, my_ce, &d) == FAILURE) {
129 return;
130}
131
132
133/* Gets an object or null, and an array.
134 If null is passed for object, obj will be set to NULL. */
135zval *obj;
136zval *arr;
137if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "o!a",
138 &obj, &arr) == FAILURE) {
139 return;
140}
141
142
143/* Gets a separated array which can also be null. */
144zval *arr;
145if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "a/!",
146 &arr) == FAILURE) {
147 return;
148}
149
150
151/* Gets a binary string in UTF-8 */
152char *str;
153int str_len;
154
155if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s&", &str, &str_len, UG(utf8_conv)) == FAILURE) {
156 return;
157}
158
159
160/* Gets a Unicode string */
161UChar *str;
162int len;
163
164if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "u", &str, &len) == FAILURE) {
165 return;
166}
167
168
169/* Gets a Unicode or binary string */
170zstr str;
171int len;
172zend_uchar type;
173
174if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "t", &str, &len, &type) == FAILURE) {
175 return;
176}
177if (type == IS_UNICODE) {
178 /* process str.u as Unicode string */
179} else {
180 /* process str.s binary string */
181}
182
183
184/* Gets two string parameters, both of which will be guaranteed to be of the same type */
185zstr str1, str2;
186int len1, len2;
187zend_uchar type1, type2;
188
189if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "TT", &str1, &len1,
190 &type1, &str2, &len2, &type2) == FAILURE) {
191 return;
192}
193if (type1 == IS_UNICODE) {
194 /* process as Unicode, str2 is guaranteed to be Unicode as well */
195} else {
196 /* process as binary string, str2 is guaranteed to be the same */
197}
198
199
200/* Get either a set of 3 longs or a string. */
201long l1, l2, l3;
202char *s;
/*
 * The function expects a pointer to an integer in this case, not a long
 * or any other type. If you specify a type which is larger
 * than an 'int', the upper bits might not be initialized
 * properly, leading to random crashes on platforms like
 * Tru64 or Linux/Alpha.
 */
210int length;
211
212if (zend_parse_parameters_ex(ZEND_PARSE_PARAMS_QUIET, ZEND_NUM_ARGS() TSRMLS_CC,
213 "lll", &l1, &l2, &l3) == SUCCESS) {
214 /* manipulate longs */
215} else if (zend_parse_parameters_ex(ZEND_PARSE_PARAMS_QUIET, ZEND_NUM_ARGS() TSRMLS_CC,
216 "s", &s, &length) == SUCCESS) {
217 /* manipulate string */
218} else {
219 /* output error */
220
221 return;
222}
223
224
225/* Function that accepts only varargs (0 or more) */
226
227int i, num_varargs;
228zval ***varargs = NULL;
229
230
231if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "*", &varargs, &num_varargs) == FAILURE) {
232 return;
233}
234
235for (i = 0; i < num_varargs; i++) {
236 /* do something with varargs[i] */
237}
238
239if (varargs) {
240 efree(varargs);
241}
242
243
244/* Function that accepts a string, followed by varargs (1 or more) */
245
246char *str;
247int str_len;
248int i, num_varargs;
249zval ***varargs = NULL;
250
251if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s+", &str, &str_len, &varargs, &num_varargs) == FAILURE) {
252 return;
253}
254
255for (i = 0; i < num_varargs; i++) {
256 /* do something with varargs[i] */
257}
258
259if (varargs) {
260 efree(varargs);
261}
262
263
264/* Function that takes an array, followed by varargs, and ending with a long */
265long num;
266zval *array;
267int i, num_varargs;
268zval ***varargs = NULL;
269
270if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "a*l", &array, &varargs, &num_varargs, &num) == FAILURE) {
271 return;
272}
273
274for (i = 0; i < num_varargs; i++) {
275 /* do something with varargs[i] */
276}
277
278if (varargs) {
279 efree(varargs);
280}
281
282
README.PEAR
README.PHP4-TO-PHP5-THIN-CHANGES
11. strrpos() and strripos() now use the entire string as a needle. Be aware
2 that the existing scripts may no longer work as you expect.
3
4 EX :
5 <?php
6 var_dump(strrpos("ABCDEF","DEF"));
7 var_dump(strrpos("ABCDEF","DAF"));
8 ?>
9
10 Will give you different results. The former returns 3 while the latter
11 returns false rather than the position of the last occurrence of 'D'.
12 The same applies to strripos().
13
142. Illegal use of string offsets causes E_ERROR instead of E_WARNING.
15
16 EX :
17 <?php
18 $a = "foo";
19 unset($a[0][1][2]);
20 ?>
21
22 Fatal error: Cannot use string offset as an array in ... on line 1
23
3. array_merge() was changed to accept only arrays. If a non-array variable is
   passed, an E_WARNING will be thrown for every such parameter. Be careful
   because your code may start emitting E_WARNING out of the blue.
27
284. Be careful when porting from ext/mysql to ext/mysqli. The following
29 functions return NULL when no more data is available in the result set
30 (ext/mysql's functions return FALSE).
31
32 - mysqli_fetch_row()
33 - mysqli_fetch_array()
34 - mysqli_fetch_assoc()
35
365. PATH_TRANSLATED server variable is no longer set implicitly under
37 Apache2 SAPI in contrast to the situation in PHP 4, where it is set to the
38 same value as the SCRIPT_FILENAME server variable when it is not populated
39 by Apache. This change was made to comply with the CGI specification.
40 Please refer to bug #23610 for further information.
41
6. Starting with PHP 5.0.0 the T_ML_CONSTANT constant is no longer defined by
   the ext/tokenizer extension. If error_reporting is set to E_ALL, code that
   still references it will produce notices. Instead of T_ML_CONSTANT for
   /* */ the T_COMMENT constant is used, thus both // and /* */ are resolved
   as the T_COMMENT constant. However, the PHPDoc style comments /** */,
   which starting with PHP 5 are parsed by PHP, are recognized as
   T_DOC_COMMENT.
48
497. $_SERVER should be populated with argc and argv if variables_order
50 includes "S". If you have specifically configured your system to not
51 create $_SERVER, then of course it shouldn't be there. The change was to
52 always make argc and argv available in the CLI version regardless of the
53 variables_order setting. As in, the CLI version will now always populate
54 the global $argc and $argv variables.
55
8. In some cases classes must be declared before they are used. This happens
   only if some of the new features of PHP 5 (such as interfaces) are used.
   Otherwise the behaviour is the same as before.
59 Example 1 (works with no errors):
60 <?php
61 $a = new a();
62 class a {
63 }
64 ?>
65
66 Example 2 (throws an error):
67 <?php
68 $a = new a();
69 interface b{
70 }
71 class a implements b {
72 }
73 ?>
74
75 Output (example 2) :
76 Fatal error: Class 'a' not found in /tmp/cl.php on line 2
77
789. get_class() starting PHP 5 returns the name of the class as it was
79 declared which may lead to problems in older scripts that rely on
80 the previous behaviour - the class name is lowercased. Expect the
81 same behaviour from get_parent_class() when applicable.
   Example :
   <?php
   class FooBar {
   }
   class ExtFooBar extends FooBar{}
   $a = new ExtFooBar();
   var_dump(get_class($a), get_parent_class($a));
   ?>

   Output (PHP 4):
   string(9) "extfoobar"
   string(6) "foobar"

   Output (PHP 5):
   string(9) "ExtFooBar"
   string(6) "FooBar"
98 ----------------------------------------------------------------------
99 Example code that will break :
100 //....
101 function someMethod($p) {
102 if (get_class($p) != 'helpingclass') {
103 return FALSE;
104 }
105 //...
106 }
107 //...
   A possible solution is to search for get_class() and get_parent_class() in
   all your scripts and wrap their results in strtolower().
110
10. get_class_methods() returns the names of the methods of a class as they
    are declared. In PHP 4 the names are all lowercased.
113 Example code :
114 <?php
115 class Foo{
116 function doFoo(){}
117 function hasFoo(){}
118 }
119 var_dump(get_class_methods("Foo"));
120 ?>
121 Output (PHP4):
122 array(2) {
123 [0]=>
124 string(5) "dofoo"
125 [1]=>
126 string(6) "hasfoo"
127 }
128 Output (PHP5):
129 array(2) {
130 [0]=>
131 string(5) "doFoo"
132 [1]=>
133 string(6) "hasFoo"
134 }
135
11. Assignment to $this is impossible. Starting with PHP 5.0.0 $this has a
    special meaning in class methods and is recognized by the PHP parser. The
    latter will generate a parse error when an assignment to $this is found.
139 Example code :
140 <?php
141 class Foo {
142 function assignNew($obj) {
143 $this = $obj;
144 }
145 }
146 $a = new Foo();
147 $b = new Foo();
148 $a->assignNew($b);
149 echo "I was executed\n";
150 ?>
151 Output (PHP 4):
152 I was executed
153 Output (PHP 5):
154 PHP Fatal error: Cannot re-assign $this in /tmp/this_ex.php on line 4
155
156
README.QNX
1QNX4 Installation Notes
2-----------------------
3
4NOTE: General installation instructions are in the INSTALL file
5
6
71. To compile and test PHP3 you have to grab, compile and install:
8 - GNU dbm library or another db library;
9 - GNU bison (1.25 or later; 1.25 tested);
10 - GNU flex (any version supporting -o and -P options; 2.5.4 tested);
11 - GNU diffutils (any version supporting -w option; 2.7 tested);
12
132. To use CVS version you may need also:
14 - GNU CVS (1.9 tested);
15 - GNU autoconf (2.12 tested);
16 - GNU m4 (1.3 or later preferable; 1.4 tested);
17
183. To run configure define -lunix in command line:
19 LDFLAGS=-lunix ./configure
20
214. To use Sybase SQL Anywhere define ODBC_QNX and CUSTOM_ODBC_LIBS in
22 command line and run configure with --with-custom-odbc:
23 CFLAGS=-DODBC_QNX LDFLAGS=-lunix CUSTOM_ODBC_LIBS="-ldblib -lodbc" ./configure --with-custom-odbc=/usr/lib/sqlany50
   If you have SQL Anywhere version 5.5.00, then you have to add
       CFLAGS=-DSQLANY_BUG
   to work around its SQLFreeEnv() bug. Other versions have not been tested,
   so try without this flag first.
28
295. To build the Apache module, you may have to hardcode an include path for
30 alloc.h in your Apache base directory:
31 - APACHE_DIRECTORY/src/httpd.h:
32 change #include "alloc.h"
33 to #include "APACHE_DIRECTORY/src/alloc.h"
34 Unless you want to use system regex library, you have to hardcode also
35 a path to regex.h:
36 - APACHE_DIRECTORY/src/conf.h:
37 change #include <regex.h>
38 to #include "APACHE_DIRECTORY/src/regex/regex.h"
   I don't know so far why this is required for QNX; it may be a Watcom
   compiler problem.
41
   If you are building the Apache module with SQL Anywhere support, you'll
   get a symbol conflict with BOOL. It is defined in both Apache (httpd.h)
   and SQL Anywhere (odbc.h). This has nothing to do with PHP, so you have to
   fix it yourself somehow.
46
476. With above precautions, it should compile as is and pass regression
48 tests completely:
49 make
50 make check
51 make install
52
   Don't bother me unless you are really sure you have done all of the above
   and it still doesn't work.
55
56June 28, 1998
57Igor Kovalenko -- owl@infomarket.ru
58
README.REDIST.BINS
11. libmagic (ext/fileinfo) see ext/fileinfo/libmagic/LICENSE
22. Oniguruma (ext/mbstring) see ext/mbstring/oniguruma/COPYING
33. libmbfl (ext/mbstring) see ext/mbstring/libmbfl/LICENSE
44. pcrelib (ext/pcre) see ext/pcre/pcrelib/LICENCE
55. ext/standard crypt
66. ext/standard crypt's blowfish implementation
77. Sqlite/Sqlite3 ext/sqlite3 ext/sqlite
88. ext/json/json_parser
99. ext/standard/rand
1010. ext/standard/scanf
1111. ext/standard/strnatcmp.c
1212. ext/standard/uuencode
1313. libxmlrpc ext/xmlrpc
1414. libzip ext/zip
1515. main/snprintf.c
1616. main/strlcat
1717. main/strlcpy
1818. libgd see ext/gd/libgd/COPYING
19
205. ext/standard crypt
21
22FreeSec: libcrypt for NetBSD
23
24Copyright (c) 1994 David Burren
25All rights reserved.
26
27Redistribution and use in source and binary forms, with or without
28modification, are permitted provided that the following conditions
29are met:
301. Redistributions of source code must retain the above copyright
31 notice, this list of conditions and the following disclaimer.
322. Redistributions in binary form must reproduce the above copyright
33 notice, this list of conditions and the following disclaimer in the
34 documentation and/or other materials provided with the distribution.
353. Neither the name of the author nor the names of other contributors
36 may be used to endorse or promote products derived from this software
37 without specific prior written permission.
38
39THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
40ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
41IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
42ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
43FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
44DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
45OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
46HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
47LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
48OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
49SUCH DAMAGE.
50
51
526. ext/standard crypt's blowfish implementation
53
54This code comes from John the Ripper password cracker, with reentrant
55and crypt(3) interfaces added, but optimizations specific to password
56cracking removed.
57
58Written by Solar Designer <solar at openwall.com> in 1998-2002 and
59placed in the public domain.
60
61There's absolutely no warranty.
62
63It is my intent that you should be able to use this on your system,
64as a part of a software package, or anywhere else to improve security,
65ensure compatibility, or for any other purpose. I would appreciate
66it if you give credit where it is due and keep your modifications in
67the public domain as well, but I don't require that in order to let
68you place this code and any modifications you make under a license
69of your choice.
70
71This implementation is compatible with OpenBSD bcrypt.c (version 2a)
72by Niels Provos <provos at citi.umich.edu>, and uses some of his
73ideas. The password hashing algorithm was designed by David Mazieres
74<dm at lcs.mit.edu>.
75
76There's a paper on the algorithm that explains its design decisions:
77
78http://www.usenix.org/events/usenix99/provos.html
79
80Some of the tricks in BF_ROUND might be inspired by Eric Young's
81Blowfish library (I can't be sure if I would think of something if I
82hadn't seen his code).
83
84
857. Sqlite/Sqlite3 ext/sqlite3 ext/sqlite
86
87The author disclaims copyright to this source code. In place of
88a legal notice, here is a blessing:
89 May you do good and not evil.
90 May you find forgiveness for yourself and forgive others.
91 May you share freely, never taking more than you give.
92
93
948. ext/json/json_parser
95Copyright (c) 2005 JSON.org
96
97Permission is hereby granted, free of charge, to any person obtaining a copy
98of this software and associated documentation files (the "Software"), to deal
99in the Software without restriction, including without limitation the rights
100to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
101copies of the Software, and to permit persons to whom the Software is
102furnished to do so, subject to the following conditions:
103
104The above copyright notice and this permission notice shall be included in all
105copies or substantial portions of the Software.
106
107The Software shall be used for Good, not Evil.
108
109THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
110IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
111FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
112AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
113LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
114OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
115SOFTWARE.
116
117
1189. ext/standard/rand
119The following php_mt_...() functions are based on a C++ class MTRand by
120Richard J. Wagner. For more information see the web page at
121http://www-personal.engin.umich.edu/~wagnerr/MersenneTwister.html
122
123Mersenne Twister random number generator -- a C++ class MTRand
124Based on code by Makoto Matsumoto, Takuji Nishimura, and Shawn Cokus
125Richard J. Wagner v1.0 15 May 2003 rjwagner@writeme.com
126
127The Mersenne Twister is an algorithm for generating random numbers. It
128was designed with consideration of the flaws in various other generators.
129The period, 2^19937-1, and the order of equidistribution, 623 dimensions,
130are far greater. The generator is also fast; it avoids multiplication and
131division, and it benefits from caches and pipelines. For more information
132see the inventors' web page at http://www.math.keio.ac.jp/~matumoto/emt.html
133
134Reference
135M. Matsumoto and T. Nishimura, "Mersenne Twister: A 623-Dimensionally
136Equidistributed Uniform Pseudo-Random Number Generator", ACM Transactions on
137Modeling and Computer Simulation, Vol. 8, No. 1, January 1998, pp 3-30.
138
139Copyright (C) 1997 - 2002, Makoto Matsumoto and Takuji Nishimura,
140Copyright (C) 2000 - 2003, Richard J. Wagner
141All rights reserved.
142
143Redistribution and use in source and binary forms, with or without
144modification, are permitted provided that the following conditions
145are met:
146
1471. Redistributions of source code must retain the above copyright
148 notice, this list of conditions and the following disclaimer.
149
1502. Redistributions in binary form must reproduce the above copyright
151 notice, this list of conditions and the following disclaimer in the
152 documentation and/or other materials provided with the distribution.
153
1543. The names of its contributors may not be used to endorse or promote
155 products derived from this software without specific prior written
156 permission.
157
158THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
159"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
160LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
161A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
162CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
163EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
164PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
165PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
166LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
167NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
168SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
169
170
17110. ext/standard/scanf
172scanf.c --
173
174This file contains the base code which implements sscanf and by extension
175fscanf. Original code is from TCL8.3.0 and bears the following copyright:
176
177This software is copyrighted by the Regents of the University of
178California, Sun Microsystems, Inc., Scriptics Corporation,
179and other parties. The following terms apply to all files associated
180with the software unless explicitly disclaimed in individual files.
181
182The authors hereby grant permission to use, copy, modify, distribute,
183and license this software and its documentation for any purpose, provided
184that existing copyright notices are retained in all copies and that this
185notice is included verbatim in any distributions. No written agreement,
186license, or royalty fee is required for any of the authorized uses.
187Modifications to this software may be copyrighted by their authors
188and need not follow the licensing terms described here, provided that
189the new terms are clearly indicated on the first page of each file where
190they apply.
191
192IN NO EVENT SHALL THE AUTHORS OR DISTRIBUTORS BE LIABLE TO ANY PARTY
193FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES
194ARISING OUT OF THE USE OF THIS SOFTWARE, ITS DOCUMENTATION, OR ANY
195DERIVATIVES THEREOF, EVEN IF THE AUTHORS HAVE BEEN ADVISED OF THE
196POSSIBILITY OF SUCH DAMAGE.
197
198THE AUTHORS AND DISTRIBUTORS SPECIFICALLY DISCLAIM ANY WARRANTIES,
199INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY,
200FITNESS FOR A PARTICULAR PURPOSE, AND NON-INFRINGEMENT. THIS SOFTWARE
201IS PROVIDED ON AN "AS IS" BASIS, AND THE AUTHORS AND DISTRIBUTORS HAVE
202NO OBLIGATION TO PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR
203MODIFICATIONS.
204
205GOVERNMENT USE: If you are acquiring this software on behalf of the
206U.S. government, the Government shall have only "Restricted Rights"
207in the software and related documentation as defined in the Federal
208Acquisition Regulations (FARs) in Clause 52.227.19 (c) (2). If you
209are acquiring the software on behalf of the Department of Defense, the
210software shall be classified as "Commercial Computer Software" and the
211Government shall have only "Restricted Rights" as defined in Clause
212252.227-7013 (c) (1) of DFARs. Notwithstanding the foregoing, the
213authors grant the U.S. Government and others acting in its behalf
214permission to use and distribute the software in accordance with the
215terms specified in this license.
216
21711. ext/standard/strnatcmp.c
218
219strnatcmp.c -- Perform 'natural order' comparisons of strings in C.
220Copyright (C) 2000 by Martin Pool <mbp@humbug.org.au>
221
222This software is provided 'as-is', without any express or implied
223warranty. In no event will the authors be held liable for any damages
224arising from the use of this software.
225
226Permission is granted to anyone to use this software for any purpose,
227including commercial applications, and to alter it and redistribute it
228freely, subject to the following restrictions:
229
2301. The origin of this software must not be misrepresented; you must not
231 claim that you wrote the original software. If you use this software
232 in a product, an acknowledgment in the product documentation would be
233 appreciated but is not required.
2342. Altered source versions must be plainly marked as such, and must not be
235 misrepresented as being the original software.
2363. This notice may not be removed or altered from any source distribution.
237
23812. ext/standard/uuencode
239Portions of this code are based on Berkeley's uuencode/uudecode
240implementation.
241
242Copyright (c) 1983, 1993
243The Regents of the University of California. All rights reserved.
244
245Redistribution and use in source and binary forms, with or without
246modification, are permitted provided that the following conditions
247are met:
2481. Redistributions of source code must retain the above copyright
249 notice, this list of conditions and the following disclaimer.
2502. Redistributions in binary form must reproduce the above copyright
251 notice, this list of conditions and the following disclaimer in the
252 documentation and/or other materials provided with the distribution.
2533. All advertising materials mentioning features or use of this software
254 must display the following acknowledgement:
255This product includes software developed by the University of
256California, Berkeley and its contributors.
2574. Neither the name of the University nor the names of its contributors
258 may be used to endorse or promote products derived from this software
259 without specific prior written permission.
260
261THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
262ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
263IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
264ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
265FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
266DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
267OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
268HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
269LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
270OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
271SUCH DAMAGE.
272
273
27413. libxmlrpc ext/xmlrpc
275
276Copyright 2000 Epinions, Inc.
277
278Subject to the following 3 conditions, Epinions, Inc. permits you, free
279of charge, to (a) use, copy, distribute, modify, perform and display this
280software and associated documentation files (the "Software"), and (b)
281permit others to whom the Software is furnished to do so as well.
282
2831) The above copyright notice and this permission notice shall be included
284without modification in all copies or substantial portions of the
285Software.
286
2872) THE SOFTWARE IS PROVIDED "AS IS", WITHOUT ANY WARRANTY OR CONDITION OF
288ANY KIND, EXPRESS, IMPLIED OR STATUTORY, INCLUDING WITHOUT LIMITATION ANY
289IMPLIED WARRANTIES OF ACCURACY, MERCHANTABILITY, FITNESS FOR A PARTICULAR
290PURPOSE OR NONINFRINGEMENT.
291
2923) IN NO EVENT SHALL EPINIONS, INC. BE LIABLE FOR ANY DIRECT, INDIRECT,
293SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES OR LOST PROFITS ARISING OUT
294OF OR IN CONNECTION WITH THE SOFTWARE (HOWEVER ARISING, INCLUDING
295NEGLIGENCE), EVEN IF EPINIONS, INC. IS AWARE OF THE POSSIBILITY OF SUCH
296DAMAGES.
297
29814. libzip ext/zip
299zip.h -- exported declarations.
300Copyright (C) 1999-2009 Dieter Baron and Thomas Klausner
301
302This file is part of libzip, a library to manipulate ZIP archives.
303The authors can be contacted at <libzip@nih.at>
304
305Redistribution and use in source and binary forms, with or without
306modification, are permitted provided that the following conditions
307are met:
3081. Redistributions of source code must retain the above copyright
309 notice, this list of conditions and the following disclaimer.
3102. Redistributions in binary form must reproduce the above copyright
311 notice, this list of conditions and the following disclaimer in
312 the documentation and/or other materials provided with the
313 distribution.
3143. The names of the authors may not be used to endorse or promote
315 products derived from this software without specific prior
316 written permission.
317
318THIS SOFTWARE IS PROVIDED BY THE AUTHORS ``AS IS'' AND ANY EXPRESS
319OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
320WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
321ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY
322DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
323DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
324GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
325INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
326IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
327OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN
328IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
329
33015. main/snprintf.c
331Copyright (c) 2002, 2006 Todd C. Miller <Todd.Miller@courtesan.com>
332
333Permission to use, copy, modify, and distribute this software for any
334purpose with or without fee is hereby granted, provided that the above
335copyright notice and this permission notice appear in all copies.
336
337THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
338WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
339MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
340ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
341WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
342ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
343OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
344
345Sponsored in part by the Defense Advanced Research Projects
346Agency (DARPA) and Air Force Research Laboratory, Air Force
347Materiel Command, USAF, under agreement number F39502-99-1-0512.
348
349main/spprintf
350Copyright (c) 1995-1998 The Apache Group. All rights reserved.
351
352Redistribution and use in source and binary forms, with or without
353modification, are permitted provided that the following conditions
354are met:
355
3561. Redistributions of source code must retain the above copyright
357 notice, this list of conditions and the following disclaimer.
358
3592. Redistributions in binary form must reproduce the above copyright
360 notice, this list of conditions and the following disclaimer in
361 the documentation and/or other materials provided with the
362 distribution.
363
3643. All advertising materials mentioning features or use of this
365 software must display the following acknowledgment:
366 "This product includes software developed by the Apache Group
367 for use in the Apache HTTP server project (http://www.apache.org/)."
368
3694. The names "Apache Server" and "Apache Group" must not be used to
370 endorse or promote products derived from this software without
371 prior written permission.
372
3735. Redistributions of any form whatsoever must retain the following
374 acknowledgment:
375 "This product includes software developed by the Apache Group
376 for use in the Apache HTTP server project (http://www.apache.org/)."
377
378THIS SOFTWARE IS PROVIDED BY THE APACHE GROUP ``AS IS'' AND ANY
379EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
380IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
381PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE APACHE GROUP OR
382ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
383SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
384NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
385LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
386HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
387STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
388ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
389OF THE POSSIBILITY OF SUCH DAMAGE.
390====================================================================
391
392This software consists of voluntary contributions made by many
393individuals on behalf of the Apache Group and was originally based
394on public domain software written at the National Center for
395Supercomputing Applications, University of Illinois, Urbana-Champaign.
396For more information on the Apache Group and the Apache HTTP server
397project, please see <http://www.apache.org/>.
398
399This code is based on, and used with the permission of, the
400SIO stdio-replacement strx_* functions by Panos Tsirigotis
401<panos@alumni.cs.colorado.edu> for xinetd.
402
40316. main/strlcat
40417. main/strlcpy
405Copyright (c) 1998 Todd C. Miller <Todd.Miller@courtesan.com>
406All rights reserved.
407
408Redistribution and use in source and binary forms, with or without
409modification, are permitted provided that the following conditions
410are met:
4111. Redistributions of source code must retain the above copyright
412 notice, this list of conditions and the following disclaimer.
4132. Redistributions in binary form must reproduce the above copyright
414 notice, this list of conditions and the following disclaimer in the
415 documentation and/or other materials provided with the distribution.
4163. The name of the author may not be used to endorse or promote products
417 derived from this software without specific prior written permission.
418
419THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES,
420INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY
421AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
422THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
423EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
424PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
425OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
426WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
427OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
428ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
429
430
README.RELEASE_PROCESS
1=======================
2 PHP Release Process
3=======================
4
5General notes and tips
6----------------------
7
81. Do not release on Fridays, Saturdays or Sundays
9because the sysadmins can not upgrade stuff then.
10
112. Package the day before a release. So if the release is to be on Thursday,
12package on Wednesday.
13
143. Ensure that Windows builds will work before packaging
15
4. Follow all steps to the letter. When unclear, ask previous RMs (Derick/Ilia)
before proceeding. Ideally, make sure that for the first releases one of the
previous RMs is around to answer questions. For the steps related to the
php/QA/bug websites, try to have someone from the webmaster team (Bjori) on hand.
20
215. Verify the tags to be extra sure everything was tagged properly.
22
236. Moving extensions from/to PECL requires root level access to the CVS server.
24Contact systems@php.net to get this taken care of.
25
26Moving extensions from php-src to PECL
27- Filesystem: cp -r php-src/ext/foo pecl/foo
28- cvs rm php-src/ext/foo
29
30If the extension is still usable or not dead, in cooperation with the extension
31maintainers if any:
32- create the pecl.php.net/foo package and its content, license, maintainer
33- create the package.xml, commit
34- release the package
35
36Moving extensions from PECL to php-src
37- Filesystem: cp -r pecl/foo php-src/ext/foo
38OR depending on the wishes from the PECL extension maintainer.
39- Filesystem: ln -s pecl/foo php-src/ext/foo
40
41Rolling a non stable release (alpha/beta/RC)
42--------------------------------------------
43
441. Check windows snapshot builder logs (http://snaps.php.net/win32/snapshot-STABLE.log f.e.)
45
462. Bump the version numbers in ``main/php_version.h``, ``configure.in`` and possibly ``NEWS``.
47Do not use abbreviations for alpha and beta.
48
493. Commit those changes
50
514. run the "scripts/dev/credits" script in php-src and commit the changes in the
52credits files in ext/standard.
53
545. tag the repository with the version f.e. "``cvs tag php_4_4_1RC1``"
55(of course, you need to change that to the version you're rolling an RC for).
56
576. Bump up the version numbers in ``main/php_version.h``, ``configure.in``
58and possibly ``NEWS`` again, to the **next** version. F.e. if the release
59candidate was "4.4.1RC1" then the new one should be "4.4.1RC2-dev" - regardless
60if we get a new RC or not. This is to make sure ``version_compare()`` can
61correctly work.
62
637. Commit those changes
64
8. Log in to the snaps box and go into the correct tree (f.e. the PHP_4_4
branch if you're rolling 4.4.x releases).
67
689. You do not have to update the tree, but of course you can with "``cvs up -dP``".
69
7010. run: ``./makedist php 4.4.1RC1``, this will export the tree, create configure
71and build two tarballs (one gz and one bz2).
72
7311. Copy those two tarballs to www.php.net, in your homedir there should be a
74directory "downloads/". Copy them into there, so that the system can generate
75MD5 sums. If you do not have this directory, talk to Derick.
76
7712. Now the RC can be found on http://downloads.php.net/yourname,
78f.e. http://downloads.php.net/derick/
79
8013. Once the release has been tagged, contact the PHP Windows development team
81(internals-win@lists.php.net) so that Windows binaries can be created. Once
82those are made, they should be placed into the same directory as the source snapshots.
83
84Getting the non stable release (alpha/beta/RC) announced
85--------------------------------------------------------
86
871. Send an email (see example here: http://news.php.net/php.internals/19486)
88**To** ``internals@lists.php.net`` and ``php-general@lists.php.net`` lists
89pointing out "the location of the release" and "the possible release date of
90either the next RC, or the final release".
91
2. Send an email (see example here: http://news.php.net/php.pear.qa/5201) **To**
``php-qa@lists.php.net`` and ``primary-qa-tests@lists.php.net``.
This email is to notify the selected projects about a new release so that they
can make sure their projects keep working. Make sure that you have been set up
as a moderator for ``primary-qa-tests@lists.php.net`` by having someone (Wez,
Derick) run the following commands for you:
98
99``ssh lists.php.net``
100
101``sudo -u ezmlm ezmlm-sub ~ezmlm/primary-qa-tester/mod moderator-email-address``
102
1033. Update the MD5 sums in ``web/qa/trunk/include/rc-md5sums.txt`` (no empty lines).
104
1054. Update in ``web/qa/trunk/include/release-qa.php`` constants with the new RC and
106commit this.
107
108 a. ``$BUILD_TEST_RELEASES`` = array("4.4.7RC1", "5.2.2RC1")
109
110 b. ``$CURRENT_QA_RELEASE_4`` = "4.4.7RC1" (``$CURRENT_QA_RELEASE_5`` for PHP5)
111
112 c. ``$RELEASE_PROCESS`` = array(4 => true, 5 => true)
113
5. Update in ``php-bugs/trunk/include/functions.inc`` the ``show_version_option``
function to include the new RC and commit.

6. Update ``phpweb/include/version.inc`` (x=major version number)

 a. ``$PHP_x_RC`` = "5.3.0RC1"

 b. ``$PHP_x_RC_DATE`` = "06 September 2007"

7. Commit those changes:

 a. ``cvs commit include/version.inc include/releases.inc``

8. For the first RC, write the doc team (phpdoc@lists.php.net) about updating the
INSTALL and win32/install.txt files which are generated from the PHP manual sources.
129
130Rolling a stable release
131------------------------
132
1331. Check windows snapshot builder logs (http://snaps.php.net/win32/snapshot-STABLE.log f.e.)
134
1352. Bump the version numbers in ``main/php_version.h``, ``configure.in`` and possibly ``NEWS``.
136
1373. **Merge** all related sections in NEWS (f.e. merge the 4.4.1RC1 and 4.4.0 sections)
138
1394. Commit those changes
140
1415. run the "scripts/dev/credits" script in php-src and commit the changes in the
142credits files in ext/standard.
143
6. tag the repository with the version f.e. "``cvs tag php_4_4_1``"
(of course, you need to change that to the version you're releasing).
When making a 5.X release, you need to tag the Zend directory separately!!
147
1487. Bump up the version numbers in ``main/php_version.h``, ``configure.in`` and
149possibly ``NEWS`` again, to the **next** version. F.e. if the release candidate
150was "4.4.1RC1" then the new one should be "4.4.1RC2-dev" - regardless if we get
151a new RC or not. This is to make sure ``version_compare()`` can correctly work.
152
1538. Commit those changes
154
9. Log in to the snaps box and go into the correct tree (f.e. the PHP_4_4
branch if you're rolling 4.4.x releases).
157
15810. You do not have to update the tree, but of course you can with "``cvs up -dP``".
159
16011. run: ``./makedist php 4.4.1``, this will export the tree, create configure
161and build two tarballs (one gz and one bz2).
162
16312. Commit those two tarballs to CVS (phpweb/distributions)
164
16513. Once the release has been tagged, contact the PHP Windows development team
166(internals-win@lists.php.net) so that Windows binaries can be created. Once
167those are made, they should be committed to CVS too.
168
16914. Check if the pear files are updated (phar for 5.1+ or run pear/make-pear-bundle.php with 4.4)
170
17115. When making a final release, also remind the PHP Windows development team
172(internals-win@lists.php.net) to prepare the installer packages for Win32.
173
174Getting the stable release announced
175------------------------------------
176
1771. Run the bumpRelease script for phpweb on your local checkout
178
179 a. ``php bin/bumpRelease 5`` (or ``php bin/bumpRelease 4`` for PHP4)
180
1812. Edit ``phpweb/include/version.inc`` and change (X=major release number):
182
183 a. ``$PHP_X_VERSION`` to the correct version
184
185 b. ``$PHP_X_DATE`` to the release date
186
187 c. ``$PHP_X_MD5`` array and update all the md5 sums
188
189 d. set ``$PHP_X_RC`` to false!
190
191 e. Make sure there are no outdated "notes" or edited "date" keys in the
192 ``$RELEASES[X][$PHP_X_VERSION]["source"]`` array
193
194 f. if the windows builds aren't ready yet prefix the "windows" key with a dot (".windows")
195
1963. Update the ChangeLog file for the given major version
197f.e. ``ChangeLog-4.php`` from the NEWS file
198
199 a. go over the list and put every element on one line
200
201 b. check for &, < and > and escape them if necessary
202
203 c. remove all the names at the ends of lines
204
205 d. for marking up, you can do the following (with VI):
206
207 I. ``s/^- /<li>/``
208
209 II. ``s/$/<\/li>/``
210
211 III. ``s/Fixed bug #\([0-9]\+\)/<?php bugfix(\1); ?>/``
212
213 IV. ``s/Fixed PECL bug #\([0-9]\+\)/<?php peclbugfix(\1); ?>/``
214
215 V. ``s/FR #\([0-9]\+\)/FR <?php bugl(\1); ?>/``
216
2174. ``cp releases/4_4_0.php releases/4_4_1.php``
218
2195. ``cvs add releases/4_4_1.php``
220
2216. Update the ``releases/*.php`` file with relevant data. The release
222announcement file should list in detail:
223
224 a. security fixes,
225
226 b. changes in behavior (whether due to a bug fix or not)
227
2287. Add a short notice to phpweb stating that there is a new release, and
229highlight the major important things (security fixes) and when it is important
230to upgrade.
231
232 a. Call php bin/createNewsEntry in your local phpweb checkout
233
234 b. Add the content for the news entry
235
2368. Commit all the changes.
237
2389. Wait an hour or two, then send a mail to php-announce@lists.php.net,
239php-general@lists.php.net and internals@lists.php.net with a text similar to
240http://news.php.net/php.internals/17222.
241
24210. Update ``php-bugs-web/include/functions.php`` to include the new version
243number, and remove the RC from there.
244
24511. Update ``qaweb/include/release-qa.php``
246
247 a. Update the $BUILD_TEST_RELEASES array with the release name
248
249 b. Update $RELEASE_PROCESS array (set to false)
250
251 I. For PHP4: Set $CURRENT_QA_RELEASE_4 to false
252
253 II. For PHP5: Set $CURRENT_QA_RELEASE_5 to false
254
255Re-releasing the same version (or -pl)
256--------------------------------------
257
2581. Commit the new binaries to ``phpweb/distributions/``
259
2602. Edit ``phpweb/include/version.inc`` and change (X=major release number):
261
262 a. If only releasing for one OS, make sure you edit only those variables
263
264 b. ``$PHP_X_VERSION`` to the correct version
265
266 c. ``$PHP_X_DATE`` to the release date
267
268 d. ``$PHP_X_MD5`` array and update all the md5 sums
269
270 e. Make sure there are no outdated "notes" or edited "date" keys in the
271 ``$RELEASES[X][$PHP_X_VERSION]["source"]`` array
272
2733. Add a short notice to phpweb stating that there is a new release, and
274highlight the major important things (security fixes) and when it is important
275to upgrade.
276
277 a. Call php bin/createNewsEntry in your local phpweb checkout
278
279 b. Add the content for the news entry
280
2814. Commit all the changes (``include/version.inc``, ``archive/archive.xml``,
282``archive/entries/YYYY-MM-DD-N.xml``)
283
2845. Wait an hour or two, then send a mail to php-announce@lists.php.net,
285php-general@lists.php.net and internals@lists.php.net with a text similar to
286the news entry.
287
README.SELF-CONTAINED-EXTENSIONS
1$Id$
2=============================================================================
3
4HOW TO CREATE A SELF-CONTAINED PHP EXTENSION
5
6 A self-contained extension can be distributed independently of
7 the PHP source. To create such an extension, two things are
8 required:
9
10 - Configuration file (config.m4)
11 - Source code for your module
12
13 We will describe now how to create these and how to put things
14 together.
15
16PREPARING YOUR SYSTEM
17
18 While the result will run on any system, a developer's setup needs these
19 tools:
20
21 GNU autoconf
22 GNU automake
23 GNU libtool
24 GNU m4
25
26 All of these are available from
27
28 ftp://ftp.gnu.org/pub/gnu/
29
30CONVERTING AN EXISTING EXTENSION
31
32 Just to show you how easy it is to create a self-contained
33 extension, we will convert an embedded extension into a
34 self-contained one. Install PHP and execute the following
35 commands.
36
37 $ mkdir /tmp/newext
38 $ cd /tmp/newext
39
40 You now have an empty directory. We will copy the files from
41 the mysql extension:
42
43 $ cp -rp php-4.0.X/ext/mysql/* .
44
45 It is time to finish the module. Run:
46
47 $ phpize
48
49 You can now ship the contents of the directory - the extension
50 can live completely on its own.
51
52 The user instructions boil down to
53
54 $ ./configure \
55 [--with-php-config=/path/to/php-config] \
56 [--with-mysql=MYSQL-DIR]
57 $ make install
58
59 The MySQL module will either use the embedded MySQL client
60 library or the MySQL installation in MYSQL-DIR.
61
62
63DEFINING THE NEW EXTENSION
64
65 Our demo extension is called "foobar".
66
67 It consists of two source files "foo.c" and "bar.c"
68 (and any arbitrary amount of header files, but that is not
69 important here).
70
71 The demo extension does not reference any external
72 libraries (that is important, because the user does not
73 need to specify anything).
74
75
 LTLIBRARY_SOURCES specifies the names of the source files. You can
 name an arbitrary number of source files here.
78
79CREATING THE M4 CONFIGURATION FILE
80
81 The m4 configuration can perform additional checks. For a
82 self-contained extension, you do not need more than a few
83 macro calls.
84
------------------------------------------------------------------------------
PHP_ARG_ENABLE(foobar,whether to enable foobar,
[  --enable-foobar        Enable foobar])

if test "$PHP_FOOBAR" != "no"; then
  PHP_NEW_EXTENSION(foobar, foo.c bar.c, $ext_shared)
fi
------------------------------------------------------------------------------
93
94 PHP_ARG_ENABLE will automatically set the correct variables, so
95 that the extension will be enabled by PHP_NEW_EXTENSION in shared mode.
96
97 The first argument of PHP_NEW_EXTENSION describes the name of the
98 extension. The second names the source-code files. The third passes
99 $ext_shared which is set by PHP_ARG_ENABLE/WITH to PHP_NEW_EXTENSION.
100
 Please always use PHP_ARG_ENABLE or PHP_ARG_WITH. Even if you do not
 plan to distribute your module with PHP, these facilities allow you
 to integrate your module easily into the main PHP module framework.
104
105CREATING SOURCE FILES
106
 ext_skel can be of great help here: it creates the code common to all PHP
 modules for you, and it can also write basic function definitions and the C
 code that handles arguments passed to your functions. See README.EXT_SKEL for
 further information.

 As for the rest, you are currently on your own. There are a lot of existing
 modules; use a simple one as a starting point and add your own code.
114
115
116CREATING THE SELF-CONTAINED EXTENSION
117
118 Put config.m4 and the source files into one directory. Then, run phpize
119 (this is installed during make install by PHP 4.0).
120
121 For example, if you configured PHP with --prefix=/php, you would run
122
123 $ /php/bin/phpize
124
125 This will automatically copy the necessary build files and create
126 configure from your config.m4.
127
128 And that's it. You now have a self-contained extension.
129
130INSTALLING A SELF-CONTAINED EXTENSION
131
132 An extension can be installed by running:
133
134 $ ./configure \
135 [--with-php-config=/path/to/php-config]
136 $ make install
137
138ADDING SHARED MODULE SUPPORT TO A MODULE
139
140 In order to be useful, a self-contained extension must be loadable
141 as a shared module. I will explain now how you can add shared module
142 support to an existing module called foo.
143
144 1. In config.m4, use PHP_ARG_WITH/PHP_ARG_ENABLE. Then you will
145 automatically be able to use --with-foo=shared[,..] or
146 --enable-foo=shared[,..].
147
148 2. In config.m4, use PHP_NEW_EXTENSION(foo,.., $ext_shared) to enable
149 building the extension.
150
151 3. Add the following lines to your C source file:
152
     #ifdef COMPILE_DL_FOO
     ZEND_GET_MODULE(foo)
     #endif
156
README.STREAMS
1An Overview of the PHP Streams abstraction
2==========================================
3$Id$
4
5WARNING: some prototypes in this file are out of date.
6The information contained here is being integrated into
7the PHP manual - stay tuned...
8
9Please send comments to: Wez Furlong <wez@thebrainroom.com>
10
11Why Streams?
12============
13You may have noticed a shed-load of issock parameters flying around the PHP
14code; we don't want them - they are ugly and cumbersome and force you to
15special case sockets and files every time you need to work with a "user-level"
16PHP file pointer.
17Streams take care of that and present the PHP extension coder with an ANSI
18stdio-alike API that looks much nicer and can be extended to support non file
19based data sources.
20
21Using Streams
22=============
23Streams use a php_stream* parameter just as ANSI stdio (fread etc.) use a
24FILE* parameter.
25
26The main functions are:
27
PHPAPI size_t php_stream_read(php_stream * stream, char * buf, size_t count);
PHPAPI size_t php_stream_write(php_stream * stream, const char * buf, size_t
        count);
PHPAPI size_t php_stream_printf(php_stream * stream TSRMLS_DC,
        const char * fmt, ...);
PHPAPI int php_stream_eof(php_stream * stream);
PHPAPI int php_stream_getc(php_stream * stream);
PHPAPI char *php_stream_gets(php_stream * stream, char *buf, size_t maxlen);
PHPAPI int php_stream_close(php_stream * stream);
PHPAPI int php_stream_flush(php_stream * stream);
PHPAPI int php_stream_seek(php_stream * stream, off_t offset, int whence);
PHPAPI off_t php_stream_tell(php_stream * stream);
PHPAPI int php_stream_lock(php_stream * stream, int mode);
41
42These (should) behave in the same way as the ANSI stdio functions with similar
43names: fread, fwrite, fprintf, feof, fgetc, fgets, fclose, fflush, fseek, ftell, flock.
44
45Opening Streams
46===============
47In most cases, you should use this API:
48
49PHPAPI php_stream *php_stream_open_wrapper(char *path, char *mode,
50 int options, char **opened_path TSRMLS_DC);
51
52Where:
53 path is the file or resource to open.
54 mode is the stdio compatible mode eg: "wb", "rb" etc.
55 options is a combination of the following values:
56 IGNORE_PATH (default) - don't use include path to search for the file
57 USE_PATH - use include path to search for the file
58 IGNORE_URL - do not use plugin wrappers
59 REPORT_ERRORS - show errors in a standard format if something
60 goes wrong.
61 STREAM_MUST_SEEK - If you really need to be able to seek the stream
62 and don't need to be able to write to the original
63 file/URL, use this option to arrange for the stream
64 to be copied (if needed) into a stream that can
65 be seek()ed.
66
67 opened_path is used to return the path of the actual file opened,
68 but if you used STREAM_MUST_SEEK, may not be valid. You are
69 responsible for efree()ing opened_path. opened_path may be (and usually
70 is) NULL.
71
72If you need to open a specific stream, or convert standard resources into
73streams there are a range of functions to do this defined in php_streams.h.
74A brief list of the most commonly used functions:
75
76PHPAPI php_stream *php_stream_fopen_from_file(FILE *file, const char *mode);
77 Convert a FILE * into a stream.
78
79PHPAPI php_stream *php_stream_fopen_tmpfile(void);
80 Open a FILE * with tmpfile() and convert into a stream.
81
82PHPAPI php_stream *php_stream_fopen_temporary_file(const char *dir,
83 const char *pfx, char **opened_path TSRMLS_DC);
84 Generate a temporary file name and open it.
85
86There are some network enabled relatives in php_network.h:
87
88PHPAPI php_stream *php_stream_sock_open_from_socket(int socket, int persistent);
89 Convert a socket into a stream.
90
91PHPAPI php_stream *php_stream_sock_open_host(const char *host, unsigned short port,
92 int socktype, int timeout, int persistent);
93 Open a connection to a host and return a stream.
94
95PHPAPI php_stream *php_stream_sock_open_unix(const char *path, int persistent,
96 struct timeval *timeout);
97 Open a UNIX domain socket.
98
99
100Stream Utilities
101================
102
If you need to copy some data from one stream to another, you will be pleased
to know that the streams API provides a standard way to do this:
105
106PHPAPI size_t php_stream_copy_to_stream(php_stream *src,
107 php_stream *dest, size_t maxlen);
108
109If you want to copy all remaining data from the src stream, pass
110PHP_STREAM_COPY_ALL as the maxlen parameter, otherwise maxlen indicates the
111number of bytes to copy.
112This function will try to use mmap where available to make the copying more
113efficient.
114
115If you want to read the contents of a stream into an allocated memory buffer,
116you should use:
117
118PHPAPI size_t php_stream_copy_to_mem(php_stream *src, char **buf,
119 size_t maxlen, int persistent);
120
This function will set buf to the address of the buffer that it allocated,
which will be maxlen bytes in length, or will be the entire length of the
data remaining on the stream if you set maxlen to PHP_STREAM_COPY_ALL.
The buffer is allocated using pemalloc(); you need to call pefree() to
release the memory when you are done.
As with copy_to_stream, this function will try to use mmap where it can.
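
The "allocate a buffer and fill it with whatever is left" behaviour can be
pictured with plain stdio. The following is only a sketch under stated
assumptions: read_all_to_mem and READ_ALL are hypothetical names that do not
exist in PHP, the helper uses malloc()/realloc()/free() rather than
pemalloc()/pefree(), and it makes no attempt to use mmap. It is not how
php_stream_copy_to_mem is implemented.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for PHP_STREAM_COPY_ALL: "no limit". */
#define READ_ALL ((size_t)-1)

/* Hypothetical helper (not part of PHP): read at most maxlen bytes from fp
 * into a freshly allocated buffer, growing the buffer as needed.
 * Stores the buffer in *buf and returns the number of bytes read.
 * The caller releases *buf with free(). */
static size_t read_all_to_mem(FILE *fp, char **buf, size_t maxlen)
{
    size_t cap = 64, len = 0;
    char *p;

    assert(fp != NULL && buf != NULL);
    *buf = NULL;
    p = malloc(cap);
    if (p == NULL)
        return 0;

    while (len < maxlen) {
        size_t want, got;

        if (len == cap) {               /* buffer full: double it */
            char *q = realloc(p, cap * 2);
            if (q == NULL)
                break;                  /* keep what we have so far */
            p = q;
            cap *= 2;
        }
        want = cap - len;
        if (want > maxlen - len)
            want = maxlen - len;        /* respect the caller's limit */
        got = fread(p + len, 1, want, fp);
        len += got;
        if (got < want)
            break;                      /* EOF or read error */
    }
    *buf = p;
    return len;
}
```

A caller would pass READ_ALL to slurp the rest of the file, or a byte count
to cap the read, and free() the buffer afterwards.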
127
128If you have an existing stream and need to be able to seek() it, you
129can use this function to copy the contents into a new stream that can
130be seek()ed:
131
132PHPAPI int php_stream_make_seekable(php_stream *origstream, php_stream **newstream);
133
134It returns one of the following values:
#define PHP_STREAM_UNCHANGED 0 /* orig stream was seekable anyway */
#define PHP_STREAM_RELEASED 1 /* newstream should be used; origstream is no longer valid */
#define PHP_STREAM_FAILED 2 /* an error occurred while attempting conversion */
#define PHP_STREAM_CRITICAL 3 /* an error occurred; origstream is in an unknown state; you should close origstream */
139
140make_seekable will always set newstream to be the stream that is valid
141if the function succeeds.
142When you have finished, remember to close the stream.
143
144NOTE: If you only need to seek forward, there is no need to call this
145function, as the php_stream_seek can emulate forward seeking when the
146whence parameter is SEEK_CUR.
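
One common way to emulate a forward seek on a source that cannot seek is to
read and discard the bytes in between. The sketch below shows that idea on a
stdio FILE*; skip_forward is a hypothetical helper, not a PHP function, and
it is not claimed to be the code php_stream_seek uses.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: advance fp by n bytes by reading and discarding
 * them, which also works when the source cannot seek (a pipe or socket).
 * Returns the number of bytes actually skipped. */
static size_t skip_forward(FILE *fp, size_t n)
{
    char scratch[1024];
    size_t skipped = 0;

    assert(fp != NULL);
    while (skipped < n) {
        size_t want = n - skipped;
        size_t got;

        if (want > sizeof(scratch))
            want = sizeof(scratch);
        got = fread(scratch, 1, want, fp);
        skipped += got;
        if (got < want)
            break;      /* EOF or error before n bytes were consumed */
    }
    return skipped;
}
```

A return value smaller than n tells the caller that the source ended early.
Seeking backwards cannot be emulated this way, which is why a seekable copy
is needed for that case.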
147
148NOTE: Writing to the stream may not affect the original source, so it
149only makes sense to use this for read-only use.
150
151NOTE: If the origstream is network based, this function will block
152until the whole contents have been downloaded.
153
154NOTE: Never call this function with an origstream that is referenced
155as a resource! It will close the origstream on success, and this
156can lead to a crash when the resource is later used/released.
157
NOTE: If you are opening a stream and need it to be seekable, use the
STREAM_MUST_SEEK option to php_stream_open_wrapper().
160
161PHPAPI int php_stream_supports_lock(php_stream * stream);
162
163This function will return either 1 (success) or 0 (failure) indicating whether or
164not a lock can be set on this stream. Typically you can only set locks on stdio streams.
165
166Casting Streams
167===============
168What if your extension needs to access the FILE* of a user level file pointer?
169You need to "cast" the stream into a FILE*, and this is how you do it:
170
FILE * fp;
php_stream * stream; /* already opened */

if (php_stream_cast(stream, PHP_STREAM_AS_STDIO, (void**)&fp, REPORT_ERRORS) == FAILURE) {
    RETURN_FALSE;
}
177
178The prototype is:
179
180PHPAPI int php_stream_cast(php_stream * stream, int castas, void ** ret, int
181 show_err);
182
183The show_err parameter, if non-zero, will cause the function to display an
184appropriate error message of type E_WARNING if the cast fails.
185
186castas can be one of the following values:
187PHP_STREAM_AS_STDIO - a stdio FILE*
188PHP_STREAM_AS_FD - a generic file descriptor
189PHP_STREAM_AS_SOCKETD - a socket descriptor
190
191If you ask a socket stream for a FILE*, the abstraction will use fdopen to
192create it for you. Be warned that doing so may cause buffered data to be lost
193if you mix ANSI stdio calls on the FILE* with php stream calls on the stream.
194
If your system has the fopencookie function, php streams can synthesize a
FILE* on top of any stream, which is useful for SSL sockets, memory based
streams, database streams, etc.
198
199In situations where this is not desirable, you should query the stream
200to see if it naturally supports FILE *. You can use this code snippet
201for this purpose:
202
    if (php_stream_is(stream, PHP_STREAM_IS_STDIO)) {
        /* can safely cast to FILE* with no adverse side effects */
    }
206
207You can use:
208
209PHPAPI int php_stream_can_cast(php_stream * stream, int castas)
210
211to find out if a stream can be cast, without actually performing the cast, so
212to check if a stream is a socket you might use:
213
if (php_stream_can_cast(stream, PHP_STREAM_AS_SOCKETD) == SUCCESS) {
    /* it can be a socket */
}
217
Please note the difference between php_stream_is and php_stream_can_cast;
stream_is tells you if the stream is a particular type of stream, whereas
can_cast tells you if the stream can be forced into the form you request.
The former doesn't change anything, while the latter *might* change some
state in the stream.
223
224Stream Internals
225================
226
227There are two main structures associated with a stream - the php_stream
228itself, which holds some state information (and possibly a buffer) and a
229php_stream_ops structure, which holds the "virtual method table" for the
230underlying implementation.
231
The php_stream_ops struct consists of pointers to methods that implement
read, write, close, flush, seek, gets and cast operations. Of these, an
implementation need only implement write, read, close and flush. The gets
method is intended to be used for streams if there is an underlying method
that can efficiently behave as fgets. The ops struct also contains a label
for the implementation that will be used when printing error messages - the
stdio implementation has a label of "STDIO" for example.
239
240The idea is that a stream implementation defines a php_stream_ops struct, and
241associates it with a php_stream using php_stream_alloc.
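
To make the "virtual method table" idea concrete, here is a small standalone
model. Every toy_* and mem_* name is hypothetical; the structs are heavily
simplified and do not match the real php_stream or php_stream_ops layouts.
The point is only the shape: one ops table per implementation, one abstract
pointer per stream instance, and generic entry points that dispatch through
the table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct toy_stream toy_stream;

/* The four mandatory operations plus a label for error messages. */
typedef struct {
    size_t (*write)(toy_stream *s, const char *buf, size_t count);
    size_t (*read)(toy_stream *s, char *buf, size_t count);
    int    (*close)(toy_stream *s);
    int    (*flush)(toy_stream *s);
    const char *label;
} toy_stream_ops;

struct toy_stream {
    const toy_stream_ops *ops;  /* implementation */
    void *abstract;             /* implementation-specific instance data */
};

/* Instance data for a read-only, memory-backed implementation. */
typedef struct {
    const char *data;
    size_t len, pos;
} mem_state;

static size_t mem_read(toy_stream *s, char *buf, size_t count)
{
    mem_state *st = s->abstract;
    size_t avail = st->len - st->pos;

    if (count > avail)
        count = avail;
    memcpy(buf, st->data + st->pos, count);
    st->pos += count;
    return count;
}

static size_t mem_write(toy_stream *s, const char *buf, size_t count)
{
    (void)s; (void)buf; (void)count;
    return 0;                   /* read-only: nothing is written */
}

static int mem_flush(toy_stream *s) { (void)s; return 0; }

static int mem_close(toy_stream *s)
{
    free(s->abstract);
    free(s);
    return 0;
}

static const toy_stream_ops mem_ops = {
    mem_write, mem_read, mem_close, mem_flush, "MEMORY"
};

/* Associates an ops table and instance data with a new stream. */
static toy_stream *toy_stream_alloc(const toy_stream_ops *ops, void *abstract)
{
    toy_stream *s = malloc(sizeof(*s));

    if (s != NULL) {
        s->ops = ops;
        s->abstract = abstract;
    }
    return s;
}

static toy_stream *toy_open_memory(const char *data, size_t len)
{
    mem_state *st = malloc(sizeof(*st));
    toy_stream *s;

    if (st == NULL)
        return NULL;
    st->data = data;
    st->len = len;
    st->pos = 0;
    s = toy_stream_alloc(&mem_ops, st);
    if (s == NULL)
        free(st);
    return s;
}

/* Generic entry points: callers never touch the implementation directly. */
static size_t toy_stream_read(toy_stream *s, char *buf, size_t count)
{
    assert(s != NULL && s->ops->read != NULL);
    return s->ops->read(s, buf, count);
}

static int toy_stream_close(toy_stream *s)
{
    assert(s != NULL && s->ops->close != NULL);
    return s->ops->close(s);
}
```

Code that holds a toy_stream* calls toy_stream_read() and toy_stream_close()
without knowing whether the data comes from memory, a file or a socket; adding
a new source means writing another ops table and another open function.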
242
243As an example, the php_stream_fopen() function looks like this:
244
PHPAPI php_stream * php_stream_fopen(const char * filename, const char * mode)
{
    FILE * fp = fopen(filename, mode);
    php_stream * ret;

    if (fp) {
        ret = php_stream_alloc(&php_stream_stdio_ops, fp, 0, 0, mode);
        if (ret)
            return ret;

        fclose(fp);
    }
    return NULL;
}
259
260php_stream_stdio_ops is a php_stream_ops structure that can be used to handle
261FILE* based streams.
262
A socket based stream would use code similar to that above to create a stream
to be passed back to fopen_wrapper (or its yet-to-be-implemented successor).
265
266The prototype for php_stream_alloc is this:
267
268PHPAPI php_stream * php_stream_alloc(php_stream_ops * ops, void * abstract,
269 size_t bufsize, int persistent, const char * mode)
270
ops is a pointer to the implementation,
abstract holds implementation specific data that is relevant to this instance
of the stream,
bufsize is the size of the buffer to use - if 0, then buffering at the stream
level will be disabled (recommended for underlying sources that implement
their own buffering - such as a FILE*),
persistent controls how the memory is to be allocated - persistently so that
it lasts across requests, or non-persistently so that it is freed at the end
of a request (it uses pemalloc),
mode is the stdio-like mode of operation - php streams places no real meaning
in the mode parameter, except that it checks for a 'w' in the string when
attempting to write (this may change).
283
284The mode parameter is passed on to fdopen/fopencookie when the stream is cast
285into a FILE*, so it should be compatible with the mode parameter of fopen().
286
287Writing your own stream implementation
288======================================
289
290!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
291RULE #1: when writing your own streams: make sure you have configured PHP with
292--enable-debug.
293I've taken some great pains to hook into the Zend memory manager to help track
294down allocation problems. It will also help you spot incorrect use of the
295STREAMS_DC, STREAMS_CC and the semi-private STREAMS_REL_CC macros for function
296definitions.
297!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
298
299RULE #2: Please use the stdio stream as a reference; it will help you
300understand the semantics of the stream operations, and it will always
301be more up to date than these docs :-)
302
303First, you need to figure out what data you need to associate with the
304php_stream. For example, you might need a pointer to some memory for memory
305based streams, or if you were making a stream to read data from an RDBMS like
306MySQL, you might want to store the connection and rowset handles.
307
308The stream has a field called abstract that you can use to hold this data.
309If you need to store more than a single field of data, define a structure to
310hold it, allocate it (use pemalloc with the persistent flag set
311appropriately), and use the abstract pointer to refer to it.
312
313For structured state you might have this:
314
315struct my_state {
316 MYSQL conn;
317 MYSQL_RES * result;
318};
319
320struct my_state * state = pemalloc(sizeof(struct my_state), persistent);
321
322/* initialize the connection, and run a query, using the fields in state to
323 * hold the results */
324
325state->result = mysql_use_result(&state->conn);
326
327/* now allocate the stream itself */
328stream = php_stream_alloc(&my_ops, state, 0, persistent, "r");
329
330/* now stream->abstract == state */
331
332Once you have that part figured out, you can write your implementation and
333define your own php_stream_ops struct (we called it my_ops in the above
334example).
335
336For example, for reading from this weird MySQL stream:
337
338static size_t php_mysqlop_read(php_stream * stream, char * buf, size_t count)
339{
340 struct my_state * state = (struct my_state*)stream->abstract;
341
342 if (buf == NULL && count == 0) {
343 /* in this special case, php_streams is asking if we have reached the
344 * end of file */
345 if (... at end of file ...)
346 return EOF;
347 else
348 return 0;
349 }
350
351 /* pull out some data from the stream and put it in buf */
352 ... mysql_fetch_row(state->result) ...
353 /* we could do something strange, like format the data as XML here,
354 and place that in the buf, but that brings in some complexities,
355 such as coping with a buffer size too small to hold the data,
356 so I won't even go into how to do that here */
357}
358
359Implement the other operations - remember that write, read, close and flush
360are all mandatory. The rest are optional. Declare your stream ops struct:
361
362php_stream_ops my_ops = {
363 php_mysqlop_write, php_mysqlop_read, php_mysqlop_close,
364 php_mysqlop_flush, NULL, NULL, NULL,
365 "Strange MySQL example"
366};
367
368That's it!
369
370Take a look at the STDIO implementation in streams.c for more information
371about how these operations work.
372The main thing to remember is that in your close operation you need to release
373and free the resources you allocated for the abstract field. In the case of
374the example above, you need to use mysql_free_result on the rowset, close the
375connection and then use pefree to dispose of the struct you allocated.
376You may read the stream->persistent field to determine if your struct was
377allocated in persistent mode or not.
378
379vim:tw=78:et
380
README.SUBMITTING_PATCH
1Submitting Patches to PHP
2=========================
3
4This document describes how to submit a patch for PHP. Creating a
5patch for PHP is easy!
6
7You don't need any login accounts or special access to download,
8build, debug and begin submitting PHP code, tests or documentation for
9inclusion in PHP. Once you've followed this README and had several
10patches accepted, PHP commit privileges are often quickly granted.
11
12An excellent article to read first is:
13http://phpadvent.org/2008/less-whining-more-coding-by-elizabeth-smith
14
15
16Prework
17-------
18If you are fixing broken functionality then create a bug or identify
19an existing bug at http://bugs.php.net/. This can be used to track
20the patch progress and prevent your changes getting lost in the PHP
21mail archives.
22
23If your code change is large, then first discuss it with the extension
24maintainer and/or a development mail list. Extension maintainers can
25be found in the EXTENSIONS file in the PHP source. Use the
26internals@lists.php.net mail list to discuss changes to the base PHP
27code. Use pecl-dev@lists.php.net for changes to code that is only
28available from PECL (http://pecl.php.net/). Use pear-dev@lists.php.net
29for PEAR modules (http://pear.php.net/). Use phpdoc@lists.php.net for
30PHP documentation questions. Mail list subscription is explained on
31http://www.php.net/mailing-lists.php.
32
33If a PHP or PECL patch affects user functionality or makes significant
34internal changes then create a simple Request For Comment (RFC) page
35on http://wiki.php.net/rfc before starting discussion. This RFC can be
36used for initial discussion and later for documentation. Wiki accounts
37can be requested on http://wiki.php.net/start?do=register
38
39Online information on PHP internal C functions is at
40http://www.php.net/internals, though this is considered
41incomplete. Various external resources can be found on the web. A
42standard reference is the book "Extending and Embedding PHP" by Sara
43Golemon.
44
45Information on contributing to PEAR is available at
46http://pear.php.net/manual/en/guide-developers.php
47
48Information on contributing to PHP documentation is at
49http://php.net/dochowto and http://wiki.php.net/doc/howto
50
51There are several IRC channels where PHP developers are often
52available to discuss questions. They include #php.pecl and #php.doc
53on the EFNet network and #php-dev-win on FreeNode.
54
55
56How to create your patch
57------------------------
58PHP uses Subversion (SVN) for revision control. Read
59http://www.php.net/svn.php for help on using SVN to get and build PHP
60source code. We recommend using a Sparse Directory checkout described
61in http://wiki.php.net/vcs/svnfaq. If you are new to SVN, read
62http://svnbook.red-bean.com.
63
64Generally we ask that patches work on the current stable PHP
65development branch and on "trunk".
66
67Read CODING_STANDARDS before you start working.
68
69After modifying the source see README.TESTING and
70http://qa.php.net/write-test.php for how to test. Submitting test
71scripts helps us to understand what functionality has changed. It is
72important for the stability and maintainability of PHP that tests are
73comprehensive.
74
75After testing is finished, create a patch file using the command:
76
77 svn diff > your_patch.txt
78
79For ease of review and later troubleshooting, submit individual
80patches for each bug or feature.
81
82
83Checklist for submitting your patch
84-----------------------------------
85 - Update SVN source just before running your final 'diff' and
86 before testing.
87 - Run "make test" to check your patch doesn't break other features.
88 - Rebuild PHP with --enable-debug (which will show some kinds of
89 memory errors) and check the PHP and web server error logs after
90 running the PHP tests.
91 - Rebuild PHP with --enable-maintainer-zts to check your patch compiles
92 on multi-threaded web servers.
93 - Create test scripts for use with "make test".
94 - Add in-line comments and/or have external documentation ready.
95 - Review the patch once more just before submitting it.
96
97
98Where to send your patch
99------------------------
100If you are patching PHP C source then email the patch to
101internals@lists.php.net
102
103If you are patching a PECL extension then send the patch to
104pecl-dev@lists.php.net
105
106If you are patching PEAR then send the patch to
107pear-dev@lists.php.net
108
109If you are patching PHP's documentation then send the patch to
110phpdoc@lists.php.net
111
112The mail can be CC'd to the extension maintainer (see EXTENSIONS).
113
114Please make the subject prefix "[PATCH]", for example "[PATCH] Fix
115return value of all array functions"
116
117Include the patch as an attachment with a file extension of ".txt".
118This is because only MIME attachments of type 'text/*' are accepted.
119
120Explain what has been fixed/added/changed by your patch. Test scripts
121should be included in the email.
122
123Include the bug id(s) which can be closed by your patch.
124
125Finally, update any open bugs and add a link to the source of your
126patch.
127
128
129What happens after you submit your patch
130----------------------------------------
131If your patch is easy to review and obviously has no side-effects,
132it might be committed relatively quickly.
133
134Because PHP is a volunteer-driven effort more complex patches will
135require patience on your side. If you do not receive feedback in a few
136days, consider resubmitting the patch. Before doing this think about
137these questions:
138
139 - Did I review the mail list archives to see if these kinds of
140 changes had been discussed before?
141 - Did I explain my patch clearly?
142 - Is my patch too hard to review? Because of which factors?
143 - Are there any unwanted white space changes?
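
As one way to look for stray white space before resubmitting, the sketch
below scans a patch file for added lines that end in white space. The
function name patch_trailing_ws is made up for this example, and it only
catches trailing white space on added lines, not every kind of white space
change.

```shell
#!/bin/sh
# patch_trailing_ws: hypothetical helper, not part of the PHP tree.
# Prints (with line numbers) every added line of a unified diff that ends
# in white space. Returns 1 if any such line is found, 0 otherwise.
patch_trailing_ws() {
    if grep -n '^+.*[[:space:]]$' "$1"; then
        return 1
    fi
    return 0
}
```

It could be run as "patch_trailing_ws your_patch.txt" on the file produced
by svn diff.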
144
145
146What happens when your patch is applied
147---------------------------------------
148Your name will be included in the SVN commit log. If your patch
149affects end users, a brief description and your name might be added to
150the NEWS file.
151
152Thank you for patching PHP!
153
README.SVN-RULES
1====================
2 SVN Commit Rules
3====================
4
5This is the first file you should be reading after you get your SVN account.
6We'll assume you're basically familiar with SVN, but feel free to post
7your questions on the mailing list. Please have a look at
8http://svnbook.red-bean.com/ for more detailed information on SVN.
9
10PHP is developed through the efforts of a large number of people.
11Collaboration is a Good Thing(tm), and SVN lets us do this. Thus, following
12some basic rules with regards to SVN usage will::
13
14 a. Make everybody happier, especially those responsible for maintaining
15 the SVN itself.
16
17 b. Keep the changes consistently well documented and easily trackable.
18
19 c. Prevent some of those 'Oops' moments.
20
21 d. Increase the general level of good will on planet Earth.
22
23Having said that, here are the organizational rules::
24
25 1. Respect other people working on the project.
26
27 2. Discuss any significant changes on the list before committing and get
28 confirmation from the release manager for the given branch.
29
30 3. Look at EXTENSIONS file to see who is the primary maintainer of
31 the code you want to contribute to.
32
33 4. If you "strongly disagree" about something another person did, don't
34 start fighting publicly - take it up in private email.
35
36 5. If you don't know how to do something, ask first!
37
38 6. Test your changes before committing them. We mean it. Really.
39 To do so use "make test".
40
41 7. For development use the --enable-maintainer-zts switch to ensure your
42 code handles TSRM correctly and doesn't break for those who need that.
43
44Currently we have the following branches in use::
45
46 trunk Will become PHP 6.0. This branch is for active development.
47
48 branches/PHP_5_3 Is used to release the PHP 5.3.x series. It still allows for
49 larger enhancements.
50
51 branches/PHP_5_2 Is used to release the PHP 5.2.x series. Only bugfixes are permitted
52 on this branch (Consult the releasemaster prior to commit).
53
54 branches/PHP_5_1 This branch is closed.
55
56 branches/PHP_4_4 This branch is closed.
57
58The next few rules are more of a technical nature::
59
60 1. All changes should first go to trunk and then get merged from trunk
61 (aka MFH'ed) to all other relevant branches.
62
63 2. DO NOT TOUCH ChangeLog! It is automagically updated from the commit
64 messages every day. Woe be to those who attempt to mess with it.
65
66 3. All news updates intended for public viewing, such as new features,
67 bug fixes, improvements, etc., should go into the NEWS file of the
68 *first* to be released version with the given change. In other words
69 any NEWS file change only needs to be done in one branch.
70
71 NB! Lines starting with @ will go automagically into the NEWS file, but
72 this is NOT recommended. Please add news entries directly to the
73 NEWS file and don't forget to keep them adjusted and sorted.
74
75 4. Do not commit multiple files and dump all messages in one commit. If you
76 modified several unrelated files, commit each group separately and
77 provide a nice commit message for each one. See example below.
78
79 5. Do write your commit message in such a way that it makes sense even
80 without the corresponding diff. One should be able to look at it, and
81 immediately know what was modified. Definitely include the function name
82 in the message as shown below.
83
84 6. In your commit messages, keep each line shorter than 80 characters. And
85 try to align your lines vertically, if they wrap. It looks bad otherwise.
86
87 7. If you modified a function that is callable from PHP, prepend PHP to
88 the function name as shown below.
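
Rule 6 above can be checked mechanically before committing. The following is
a minimal sketch, not an official tool: the name check_commit_msg is invented
here, and it only tests line length, not alignment.

```shell
#!/bin/sh
# check_commit_msg: hypothetical helper, not part of the PHP tree.
# Reads a draft commit message on stdin and reports every line that is
# 80 characters or longer. Exits non-zero if any such line was found.
check_commit_msg() {
    awk 'length($0) >= 80 {
             printf "line %d is %d characters long\n", NR, length($0)
             bad = 1
         }
         END { exit bad + 0 }'
}
```

For example, "check_commit_msg < draft_message.txt", where draft_message.txt
is whatever file holds your draft.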
89
90
91The format of the commit messages is pretty simple.
92
93Use a - to start a new item in your commit message.
94
95If a line begins with #, it is taken to be a comment and will not appear
96in the ChangeLog. Everything else goes into the ChangeLog.
97
98It is important to note that if your comment or news logline spans multiple
99lines, you have to put # at the beginning of **every** such line.
100
101Example. Say you modified two files, datetime.c and string.c. In datetime.c you
102added a new format option for the date() function, and in string.c you fixed a
103memory leak in php_trim(). Don't commit both of these at once. Commit them
104separately and try to make sure your commit messages look something like the
105following.
106
107For datetime.c::
108
109 - Added new 'K' format modifier to date() for printing out number of days
110 until New Year's Eve.
111
112For string.c::
113
114 - Fixed a memory leak in php_trim() resulting from improper use of zval_dtor().
115 #- Man, that thing was leaking all over the place!
116
117The # lines will be omitted from the ChangeLog automagically.
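
To preview roughly which lines of a draft message would reach the ChangeLog,
you can drop the # lines yourself. This is only a sketch of the rule stated
above: changelog_preview is a made-up name, and the real ChangeLog generator
may apply further processing.

```shell
#!/bin/sh
# changelog_preview: hypothetical helper, not part of the PHP tree.
# Reads a draft commit message on stdin and prints only the lines that do
# not begin with '#', mirroring the comment rule described above.
changelog_preview() {
    grep -v '^#'
}
```

For example, "changelog_preview < draft_message.txt" prints the draft without
its comment lines.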
118
119Use the [DOC] tag in your log message whenever you feel that your changes
120imply a documentation modification. The php-doc team will automatically
121get notified about your commit through the php-doc mailing list.
122
123If you fix some bugs, you should note the bug ID numbers in your
124commit message. Bug ID should be prefixed by "#" for easier access to
125bug report when developers are browsing SVN via LXR or Bonsai.
126
127Example::
128
129 Fixed bug #14016 (pgsql notice handler double free crash bug.)
130
131If you don't see your messages in ChangeLog right away, don't worry!
132These files are updated once a day, so your stuff will not show up until
133somewhat later.
134
135When you change the NEWS file for a bug fix, then please keep the bugs
136sorted in decreasing order under the fixed version.
137
138You can use LXR (http://lxr.php.net/) and Bonsai (http://bonsai.php.net/)
139to look at PHP SVN repository in various ways.
140
141To receive daily updates to ChangeLog and NEWS, send an empty message to
142php-cvs-daily-subscribe@lists.php.net.
143
144Happy hacking,
145
146PHP Team
147
README.TESTING
1[IMPORTANT NOTICE]
2------------------
3 Failed tests usually indicate a problem with your local system setup
4and not within PHP itself (at least for official PHP release versions).
5You may decide to automatically submit a test summary to our QA workflow
6at the end of a test run.
7 Please do *not* submit a failed test as a bug or ask for help on why
8it failed on your system without providing substantial backup information
9on *why* the test failed on your special setup. Thank you :-)
10
11
12[Testing Basics]
13----------------
14 The easiest way to test your PHP build is to run "make test" from the
15command line after successfully compiling. This will run the complete
16tests for all enabled functionalities and extensions using the PHP
17CLI binary.
18 To execute test scripts, you must build PHP with some SAPI, then you
19type "make test" to execute all or some test scripts saved under
20"tests" directory under source root directory.
21
22Usage:
23make test
24
25 "make test" basically executes "run-tests.php" script
26under the source root (parallel builds will not work). Therefore you
27can execute the script as follows:
28
29TEST_PHP_EXECUTABLE=sapi/cli/php \
30sapi/cli/php [-c /path/to/php.ini] run-tests.php [ext/foo/tests/GLOB]
31
32
33[Which "php" executable "make test" looks for]
34----------------------------------------------
35If you are running the run-tests.php script from the command line (as above)
36you must set the TEST_PHP_EXECUTABLE environment variable to explicitly
37select the PHP executable that is to be tested, that is, used to run the test scripts.
38
39If you run the tests using make test, the PHP CLI and CGI executables are
40automatically set for you. "make test" executes "run-tests.php" script with the CLI binary. Some
41test scripts such as session must be executed by CGI SAPI. Therefore,
42you must build PHP with CGI SAPI to perform all tests.
43
44NOTE: The PHP binary executing "run-tests.php" and the PHP binary used for
45executing test scripts may differ. If you use a different PHP binary for
46executing the "run-tests.php" script, you may get errors.
47
48
49[Which php.ini is used]
50-----------------------
51 "make test" uses the same php.ini file as it would once installed.
52The tests have been written to be independent of that php.ini file,
53so if you find a test that is affected by a setting, please report
54this, so we can address the issue.
55
56
57[Which test scripts are executed]
58---------------------------------
59 "run-tests.php" ("make test"), without any arguments executes all
60test scripts by extracting all directories named "tests"
61from the source root and any subdirectories below. If there are files,
62which have a "phpt" extension, "run-tests.php" looks at the sections
63in these files, determines whether it should run it, by evaluating
64the 'SKIP' section. If the test is eligible for execution, the 'FILE'
65section is extracted into a ".php" file (with the same name besides
66the extension) and gets executed.
67When an argument is given or TESTS environment variable is set, the
68GLOB is expanded by the shell and any file with extension "*.phpt" is
69regarded as a test file.
70
71 Testers can easily execute tests selectively as follows.
72
73Examples:
74./sapi/cli/php run-tests.php ext/mbstring/*
75./sapi/cli/php run-tests.php ext/mbstring/020.phpt
76
77
78[Test results]
79--------------
80 Test results are printed to standard output. If there is a failed test,
81the "run-tests.php" script saves the result, the expected result and the
82code executed to the test script directory. For example, if
83ext/myext/tests/myext.phpt fails to pass, the following files are created:
84
85ext/myext/tests/myext.php - actual test file executed
86ext/myext/tests/myext.log - log of test execution (L)
87ext/myext/tests/myext.exp - expected output (E)
88ext/myext/tests/myext.out - output from test script (O)
89ext/myext/tests/myext.diff - diff of .out and .exp (D)
90
91 Failed tests are always bugs. Either the test is buggy or fails to consider
92factors applying to the tester's environment, or there is a bug in PHP.
93If this is a known bug, we strive to provide bug numbers, in either the
94test name or the file name. You can check the status of such a bug, by
95going to: http://bugs.php.net/12345 where 12345 is the bug number.
96For clarity and automated processing, bug numbers are prefixed by a hash
97sign '#' in test names and/or test cases are named bug12345.phpt.
98
99NOTE: The files generated by tests can be selected by setting the
100environment variable TEST_PHP_LOG_FORMAT. For each file you want to be
101generated use the character in brackets as shown above (default is LEOD).
102The php file will be generated always.
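
Since a .diff file is written next to each failed test (as long as D is part
of TEST_PHP_LOG_FORMAT), the failed tests can be collected after a run by
looking for those files. The following is a minimal sketch under that
assumption; list_failed_tests is a made-up name, and the function cannot tell
a fresh .diff file from one left over by an earlier run.

```shell
#!/bin/sh
# list_failed_tests: hypothetical helper, not part of run-tests.php.
# Prints the matching .phpt path for every .diff file found below the given
# directory (default: current directory), sorted by name.
list_failed_tests() {
    find "${1:-.}" -type f -name '*.diff' | sed 's/\.diff$/.phpt/' | sort
}
```

For example, "list_failed_tests ext/myext" would print
ext/myext/tests/myext.phpt if ext/myext/tests/myext.diff exists.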
103
104NOTE: You can set environment variable TEST_PHP_DETAILED to enable
105detailed test information.
106
107[Automated testing]
108 If you want to run the tests unattended (for example, to keep up with the
109latest developments and quality assurance), set the environment variable
110NO_INTERACTION to 1 so that the tester is not prompted for any user input.
111
112Normally, the exit status of "make test" is zero, regardless of the results
113of independent tests. Set the environment variable REPORT_EXIT_STATUS to 1,
114and "make test" will set the exit status ("$?") to non-zero, when an
115individual test has failed.
116
117Example script to be run by cron(1):
118========== qa-test.sh =============
119#!/bin/sh
120
121CO_DIR=$HOME/cvs/php5
122MYMAIL=qa-test@domain.com
123TMPDIR=/var/tmp
124TODAY=`date +"%Y%m%d"`
125
126# Make sure compilation environment is correct
127CONFIGURE_OPTS='--disable-all --enable-cli --with-pcre'
128export MAKE=gmake
129export CC=gcc
130
131# Set test environment
132export NO_INTERACTION=1
133export REPORT_EXIT_STATUS=1
134
135cd $CO_DIR
136cvs update . >>$TMPDIR/phpqatest.$TODAY
137./cvsclean ; ./buildconf ; ./configure $CONFIGURE_OPTS ; $MAKE
138$MAKE test >>$TMPDIR/phpqatest.$TODAY 2>&1
139if test $? -gt 0
140then
141 cat $TMPDIR/phpqatest.$TODAY | mail -s"PHP-QA Test Failed for $TODAY" $MYMAIL
142fi
143========== end of qa-test.sh =============
144
145NOTE: the exit status of run-tests.php will be 1 when
146REPORT_EXIT_STATUS is set. The result of "make test" may be higher
147than that. At present, gmake 3.79.1 returns 2, so it is
148advised to test for non-zero, rather than a specific value.
149
150
151[Creating new test files]
152-------------------------
153 Writing a test file is very easy if you are used to PHP.
154See the HOWTO at http://qa.php.net/write-test.php
155
156
157[How to help us]
158----------------
159 If you find a bug in PHP, you can submit a bug report AND a test script
160for us. You don't have to write a complete script; just give us a test
161script in the following format. Please test the script and make sure
162you write the correct ACTUAL OUTPUT and EXPECTED OUTPUT before you
163submit.
164
165<?php
166/*
167Bug #12345
168substr() bug. Does not return expected string.
169
170ACTUAL OUTPUT
171XYXA
172
173EXPECTED OUTPUT
174ABCD
175*/
176
177$str = "XYZABCD";
178echo substr($str,3,7);
179
180?>
181
README.TESTING2
1[IMPORTANT NOTICE]
2------------------
3This is an addendum to README.TESTING with additional information
4specific to server-tests.php.
5
6server-tests.php is backward compatible with tests developed for
7the original run-tests.php script. server-tests is *not* used by
8'make test'. server-tests was developed to provide support for
9testing PHP under its primary environment, HTTP, and can run the
10PHP tests under any of the SAPI modules that are direct executables,
11or are accessible via HTTP.
12
13[New features]
14----------------
15* Command line interface:
16 You can run 'php server-tests.php -h' to get all the possible options.
17* Configuration file:
18 the -c argument will allow you to use a configuration file. This is
19 handy if you are testing multiple environments and need various options
20 depending on the environment.
21 see server-tests-config.php for details.
22* CGI Emulation:
23 Will emulate a CGI environment when testing with the cgi sapi executable.
24* HTTP testing:
25 can be configured to run test scripts through an HTTP server running
26 on localhost. localhost is required since either the web server must
27 alias a directory to the php source directory, or the test scripts
28 must be copied to a directory under the web server
29 (see config options TEST_WEB_BASE_URL, TEST_BASE_PATH, and TEST_WEB_EXT)
30* New sections supported for test files (see below)
31
32When running tests over http, tests that require ini settings different than what
33the web server runs under will be skipped. Since the test harness defines a number
34of ini settings by default, the web server may require special configuration to
35make testing work.
36
37[Example Usage]
38----------------
39Some (but not all!) examples of usage:
40
411. run tests from the php source directory
42 php server-tests.php -p /path/to/php-cli
43
442. run tests using cgi emulation
45 php server-tests.php -p /path/to/php-cgi
46
473. run tests over http, copying test files into document root
48 php server-tests.php -w -u http://localhost/test -m /path/to/htdocs/test
49
504. run tests over http, php sources have been aliased in web server
51 php server-tests.php -w -u http://localhost/test
52
535. run tests using configuration file
54 php server-tests.php -c /path/to/server-tests-config.php
55
566. run tests using configuration file, but overriding some settings:
57 (config file must be first)
58 php server-tests.php -c /path/to/server-tests-config.php -w -t 3 -d /path/to/testdir
59
60NOTE: configuration as described in README.TESTING still works.
61
62[New Test Sections]
63----------------
64In addition to the traditional test sections
65(see http://qa.php.net/write-test.php), several new sections are available
66under server-tests.
67
68--POST--
69This is not a new section, but now multipart posts are supported for testing
70file uploads, or other types of POST data.
71
72--CGI--
73This section takes no value. It merely provides a simple marker for tests
74that MUST be run as CGI, even if there are no --POST-- or --GET-- sections
75in the test file.
76
77--DESCRIPTION--
78Not used for anything, just a section for documenting the test
79
80--ENV--
81This section gets eval()'d to help build an environment for the
82execution of the test. This can be used to change environment
83vars that are used for CGI emulation, or simply to set env vars
84for cli testing. A full example looks like:
85
86 --ENV--
87 return <<<END
88 PATH_TRANSLATED=$filename
89 PATH_INFO=$scriptname
90 SCRIPT_NAME=$scriptname
91 END;
92
93Some variables are made easily available for use in this section, they
94include:
95 $filename full native path to file, will become PATH_TRANSLATED
96 $filepath =dirname($filename)
97 $scriptname this is what will become SCRIPT_NAME unless you override it
98 $docroot the equivalent of DOCUMENT_ROOT under Apache
99 $cwd the directory that the test is being initiated from
100 $this->conf all server-tests configuration vars
101 $this->env all environment variables that will get passed to the test
102
103
104--REQUEST--
105This section is also eval'd, and is similar in nature to --ENV--. However,
106this section is used to build the url used in an HTTP request. Valid values
107to set in this section would include:
108 SCRIPT_NAME The initial part of the request url
109 PATH_INFO The pathinfo part of a request url
110 FRAGMENT The fragment section of a url (after #)
111 QUERY_STRING The query part of a url (after ?)
112
113 --REQUEST--
114 return <<<END
115 PATH_INFO=/path/info
116 END;
117
118--HEADERS--
119This section is also eval'd. It is used to provide additional headers sent
120in an HTTP request, such as content type for multipart posts, cookies, etc.
121
122 --HEADERS--
123 return <<<END
124 Content-Type=multipart/form-data; boundary=---------------------------240723202011929
125 Content-Length=100
126 END;
127
128--EXPECTHEADERS--
129This section can be used to define what headers are required to be
130received back from a request, and is checked in addition to the
131regular expect sections. For example:
132
133 --EXPECTHEADERS--
134 Status: 404
135
136
137
138
README.UNICODE
1Audience
2========
3
4This README describes how PHP 6 provides native support for the Unicode
5Standard. Readers of this document should be proficient with PHP and have a
6basic understanding of Unicode concepts. For more technical details about
7PHP 6 design principles and for guidelines about writing Unicode-ready PHP
8extensions, refer to README.UNICODE-UPGRADES.
9
10Introduction
11============
12
13As successful as PHP has proven to be over the years, its support for
14multilingual and multinational environments has languished. PHP can no
15longer afford to remain outside the overall movement towards the Unicode
16standard. Although recent updates involving the mbstring extension have
17enabled easier multibyte data processing, this does not constitute native
18Unicode support.
19
20Since the full implementation of the Unicode Standard is very involved, our
21approach is to speed up implementation by using the well-tested,
22full-featured, and freely available ICU (International Components for
23Unicode) library.
24
25
26General Remarks
27===============
28
29International Components for Unicode
30------------------------------------
31
32ICU (International Components for Unicode) is a mature, widely used set of
33C/C++ and Java libraries for Unicode support, software internationalization
34and globalization. It provides:
35
36 - Encoding conversions
37 - Collations
38 - Unicode text processing
39 - and much more
40
41When building PHP 6, Unicode support is always enabled. The only
42configuration option during development should be the location of the ICU
43headers and libraries.
44
45 --with-icu-dir=<dir>
46
47where <dir> specifies the location of ICU header and library files. If you do
48not specify this option, PHP attempts to find ICU under /usr and /usr/local.
49
50NOTE: ICU is not bundled with PHP 6 yet. To download the distribution, visit
51http://icu.sourceforge.net. PHP requires ICU version 3.4 or higher.
52
53Backwards Compatibility
54-----------------------
55Our paramount concern for providing Unicode support is backwards compatibility.
56Because PHP is used on so many sites, existing data types and functions must
57work as they always have. However, although PHP's interfaces must remain
58backwards-compatible, the speed of certain operations might be affected due to
59internal implementation changes.
60
61Encoding Names
62--------------
63All the encoding settings discussed in this document can accept any valid
64encoding name supported by ICU. For a full list of encodings, refer to the ICU
65online documentation.
66
67NOTE: References to "Unicode" in this document generally mean the UTF-16
68character encoding, unless explicitly stated otherwise.
69
70Unicode Semantics Switch
71========================
72
73Because many applications do not require Unicode, PHP 6 provides a server-wide
74INI setting to enable Unicode support:
75
76 unicode.semantics = On/Off
77
78This switch is off by default. If your applications do not require native
79Unicode support, you may leave this switch off, and continue to use Unicode
80strings only when you need to.
81
82However, if your application is ready to fully support Unicode, you should
83turn this switch on. This activates various Unicode support mechanisms,
84including:
85
86 * All string literals become Unicode
87 * All variables received from HTTP requests become Unicode
88 * PHP identifiers may use Unicode characters
89
90More fundamentally, your PHP environment is now a Unicode environment. Strings
91inside PHP are Unicode, and the system is responsible for converting non-Unicode
92strings on PHP's periphery (for example, in HTTP input and output, streams, and
93filesystem operations). With unicode.semantics on, you must specify binary
94strings explicitly. PHP makes no assumptions about the content of a binary
95string, so your application must handle all binary strings appropriately.
96
97Conversely, if unicode.semantics is off, PHP behaves as it did in the past.
98String literals do not become Unicode, and files are binary strings for
99backwards compatibility. You can always create Unicode strings programmatically,
100and all functions and operators support Unicode strings transparently.
101
102
103Fallback Encoding
104=================
105
106The fallback encoding provides a default value for all other unicode.*_encoding
107INI settings. If you do not set a particular unicode.*_encoding setting, PHP
108uses the fallback encoding. If you do not specify a fallback encoding, PHP uses
109UTF-8.
110
111 unicode.fallback_encoding = "iso-8859-1"
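
The lookup order described above can be sketched in C. This is only an
illustration of the rule, not PHP's implementation: the helper name
effective_encoding() is hypothetical, and NULL or an empty string stands for
"not set".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of PHP): returns the encoding that applies
 * to one unicode.*_encoding setting. NULL or "" means "not set". */
const char *effective_encoding(const char *specific, const char *fallback)
{
    if (specific != NULL && strlen(specific) > 0) {
        return specific;    /* the specific setting wins */
    }
    if (fallback != NULL && strlen(fallback) > 0) {
        return fallback;    /* unicode.fallback_encoding */
    }
    return "UTF-8";         /* neither is set */
}
```

With the INI line above and no unicode.runtime_encoding set, the sketch
resolves the runtime encoding to "iso-8859-1".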
112
113
114Runtime Encoding
115================
116
117The runtime encoding specifies the encoding PHP uses for converting binary
118strings within the PHP engine itself.
119
120 unicode.runtime_encoding = "iso-8859-1"
121
122This setting has no effect on I/O-related operations such as writing to
123standard out, reading from the filesystem, or decoding HTTP input variables.
124
125PHP enables you to explicitly convert strings using casting:
126
127 * (binary) -- casts to binary string type
128 * (unicode) -- casts to Unicode string type
129 * (string) -- casts to Unicode string type if unicode.semantics is on,
130 to binary otherwise
131
For example, if unicode.runtime_encoding is iso-8859-1, and $uni is a Unicode
string, then

    $str = (binary)$uni;

creates a binary string $str in the ISO-8859-1 encoding.
138
139Implicit conversions include concatenation, comparison, and parameter passing.
140For better precision, PHP attempts to convert strings to Unicode before
141performing these sorts of operations. For example, if we concatenate our binary
142string $str with a unicode literal, PHP converts $str to Unicode first, using
143the encoding specified by unicode.runtime_encoding.
144
145Output Encoding
146===============
147
148PHP automatically converts output for commands that write to the standard
149output stream, such as 'print' and 'echo'.
150
151 unicode.output_encoding = "utf-8"
152
153However, PHP does not convert binary strings. When writing to files or external
154resources, you must rely on stream encoding features or manually encode the data
155using functions provided by the unicode extension.
156
The existing default_charset INI setting is DEPRECATED in favor of
unicode.output_encoding. Previously, default_charset only specified the charset
portion of the Content-Type MIME header. Now default_charset only takes effect
when unicode.semantics is off, and it does not affect the actual transcoding of
the output stream. Setting unicode.output_encoding causes PHP to add the
'charset' portion to the Content-Type header, overriding any value set for
default_charset.
164
165
166HTTP Input Encoding
167===================
168
The HTTP input encoding specifies the encoding of variables received via
HTTP, such as the contents of the $_GET and $_POST arrays.
171
172This functionality is currently under development. For a discussion of the
173approach that the PHP 6 team is taking, refer to:
174
175http://marc.theaimsgroup.com/?t=116613047300005&r=1&w=2
176
177
178Filesystem Encoding
179===================
180
181The filesystem encoding specifies the encoding of file and directory names
182on the filesystem.
183
184 unicode.filename_encoding = "utf-8"
185
186Filesystem-related functions such as opendir() perform this conversion when
187accepting and returning file names. You should set the filename encoding to
188the encoding used by your filesystem.
189
190
191Script Encoding
192===============
193
194You may write PHP scripts in any encoding supported by ICU. To specify the
195script encoding site-wide, use the INI setting:
196
197 unicode.script_encoding = utf-8
198
199If you cannot change the encoding system wide, you can use a pragma to
200override the INI setting in a local script:
201
202 <?php declare(encoding = 'Shift-JIS'); ?>
203
204The pragma setting must be the first statement in the script. It only affects
205the script in which it occurs, and does not propagate to any included files.
206
207
208INI Files
209=========
210
211If unicode.semantics is on, INI files are presumed to contain UTF-8 encoded
212keys and values. If unicode.semantics is off, the data is taken as-is,
213similar to PHP 5. No validation occurs during parsing. Instead invalid UTF-8
214sequences are caught during access by ini_*() functions.
215
216
217Stream I/O
218==========
219
220PHP has a streams-based I/O system for generalized filesystem access,
221networking, data compression, and other operations. Since the data on the
222other end of the stream can be in any encoding, you need to think about
223data conversion.
224
230By default, PHP opens streams in binary mode. To open a file in text mode,
231you must use the 't' flag (or the FILE_TEXT parameter -- see below). The
232default encoding for streams in text mode is UTF-8. This means that if
233'file.txt' is a UTF-8 text file, this code snippet:
234
    $fp = fopen('file.txt', 'rt');
    $str = fread($fp, 100);

returns 100 Unicode characters, while:

    $fp = fopen('file.txt', 'wt');
    fwrite($fp, $uni);

writes to a UTF-8 text file.
244
245If you mainly work with files in an encoding other than UTF-8, you can
246change the default context encoding setting:
247
248 stream_default_encoding('Shift-JIS');
249 $data = file_get_contents('file.txt', FILE_TEXT);
250 // work on $data
251 file_put_contents('file.txt', $data, FILE_TEXT);
252
253The file_get_contents() and file_put_contents() functions now accept an
254additional parameter, FILE_TEXT. If you provide FILE_TEXT for
255file_get_contents(), PHP returns a Unicode string. Without FILE_TEXT, PHP
256returns a binary string (which would be appropriate for true binary data, such
257as an image file). When writing a Unicode string with file_put_contents(), you
258must supply the FILE_TEXT parameter, or PHP generates a warning.
259
260If you need to work with multiple encodings, you can create custom contexts
261using stream_context_create() and then pass in the custom context as an
262additional parameter. For example:
263
264 $ctx = stream_context_create(NULL, array('encoding' => 'big5'));
265 $data = file_get_contents('file.txt', FILE_TEXT, $ctx);
266 // work on $data
267 file_put_contents('file.txt', $data, FILE_TEXT, $ctx);
268
269
270Conversion Semantics and Error Handling
271=======================================
272
PHP can convert strings explicitly (casting) and implicitly (concatenation,
comparison, and parameter passing). For example, when concatenating a Unicode
string and a binary string, PHP converts the binary string to Unicode for better
precision.
277
278However, not all characters can be converted between Unicode and legacy
279encodings. The first possibility is that a string contains corrupt data or
280an illegal byte sequence. In this case, the converter simply stops with
281a message that resembles:
282
283 Warning: Could not convert binary string to Unicode string
284 (converter UTF-8 failed on bytes (0xE9) at offset 2)
285
286Conversely, if a similar error occurs when attempting to convert Unicode to
287a legacy string, the converter generates a message that resembles:
288
289 Warning: Could not convert Unicode string to binary string
290 (converter ISO-8859-1 failed on character {U+DC00} at offset 2)
291
292To customize this behavior, refer to "Creating a Custom Error Handler" below.
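
To make the "failed on bytes ... at offset" part of the first warning concrete,
here is a minimal C sketch that finds the offset of the first ill-formed UTF-8
sequence in a byte buffer. It is a stand-alone illustration and an assumption
about what such a check involves, not the ICU converter PHP actually uses; the
function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the byte offset of the first ill-formed UTF-8 sequence in buf,
 * or -1 if the whole buffer is well-formed. */
long utf8_first_invalid_offset(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    assert(buf != NULL || len == 0);
    while (i < len) {
        unsigned char b = buf[i];
        unsigned long cp, min;
        size_t need, k;     /* need = continuation bytes expected */

        if (b < 0x80) {
            i++;
            continue;
        } else if (b >= 0xC2 && b <= 0xDF) {
            need = 1; cp = b & 0x1F; min = 0x80;
        } else if (b >= 0xE0 && b <= 0xEF) {
            need = 2; cp = b & 0x0F; min = 0x800;
        } else if (b >= 0xF0 && b <= 0xF4) {
            need = 3; cp = b & 0x07; min = 0x10000;
        } else {
            return (long)i; /* stray continuation byte or invalid lead byte */
        }
        if (len - i <= need) {
            return (long)i; /* sequence is truncated */
        }
        for (k = 1; k <= need; k++) {
            if ((buf[i + k] & 0xC0) != 0x80) {
                return (long)i;
            }
            cp = (cp << 6) | (buf[i + k] & 0x3F);
        }
        /* reject overlong forms, surrogates and values above U+10FFFF */
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            return (long)i;
        }
        i += need + 1;
    }
    return -1;
}
```

For the six bytes "< \xe9\xfe >" the sketch reports offset 2, the position of
the 0xE9 byte named in the sample warning.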
293
294The second possibility is that a Unicode character simply cannot be represented
295in the legacy encoding. By default, when downconverting from Unicode, the
296converter substitutes any missing sequences with the appropriate substitution
297sequence for that codepage, such as 0x1A (Control-Z) in ISO-8859-1. When
298upconverting to Unicode, the converter replaces any byte sequence that has no
299Unicode equivalent with the Unicode substitution character (U+FFFD).
300
301You can customize the conversion error behavior to:
302
303 - stop the conversion and return an empty string
304 - skip any invalid characters
305 - substitute invalid characters with a custom substitution character
306 - escape the invalid character in various formats
307
308To control the global conversion error settings, use the functions:
309
310 unicode_set_error_mode(int direction, int mode)
311 unicode_set_subst_char(unicode char)
312
313where direction is either FROM_UNICODE or TO_UNICODE, and mode is one of these
314constants:
315
316 U_CONV_ERROR_STOP
317 U_CONV_ERROR_SKIP
318 U_CONV_ERROR_SUBST
319 U_CONV_ERROR_ESCAPE_UNICODE
320 U_CONV_ERROR_ESCAPE_ICU
321 U_CONV_ERROR_ESCAPE_JAVA
322 U_CONV_ERROR_ESCAPE_XML_DEC
323 U_CONV_ERROR_ESCAPE_XML_HEX
324
325As an example, with a runtime encoding of ISO-8859-1, the conversion:
326
327 $str = (binary)"< \u30AB >";
328
329results in:
330
    MODE                  RESULT
    --------------------------------------
    stop                  ""
    skip                  "<  >"
    substitute            "< ? >"
    escape (Unicode)      "< {U+30AB} >"
    escape (ICU)          "< %U30AB >"
    escape (Java)         "< \u30AB >"
    escape (XML decimal)  "< &#12459; >"
    escape (XML hex)      "< &#x30AB; >"
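
The FROM_UNICODE rows above can be reproduced with a small stand-alone C model.
It is a sketch under stated assumptions, not PHP's converter: the enum and
to_latin1() are hypothetical names, input is limited to BMP code points, the
target is fixed to ISO-8859-1, and the substitution character is '?'.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the U_CONV_ERROR_* constants. */
enum conv_mode {
    MODE_STOP, MODE_SKIP, MODE_SUBST, MODE_ESCAPE_UNICODE, MODE_ESCAPE_ICU,
    MODE_ESCAPE_JAVA, MODE_ESCAPE_XML_DEC, MODE_ESCAPE_XML_HEX
};

/* Converts n BMP code points to ISO-8859-1, writing a NUL-terminated result
 * into out (capacity cap bytes). Returns out; the result is "" when the
 * conversion stops or the buffer is too small. */
char *to_latin1(const uint32_t *in, size_t n, enum conv_mode mode,
                char *out, size_t cap)
{
    size_t used = 0, i;

    assert(in != NULL && out != NULL && cap > 0);
    out[0] = '\0';
    for (i = 0; i < n; i++) {
        char piece[16] = "";
        size_t plen;
        unsigned cp = (unsigned)in[i];

        assert(cp <= 0xFFFF);       /* this sketch handles the BMP only */
        if (cp <= 0xFF) {
            piece[0] = (char)cp;    /* Latin-1 maps 1:1 to U+0000..U+00FF */
            plen = 1;
        } else {
            switch (mode) {
            case MODE_STOP:           out[0] = '\0'; return out;
            case MODE_SKIP:           break;
            case MODE_SUBST:          snprintf(piece, sizeof piece, "?"); break;
            case MODE_ESCAPE_UNICODE: snprintf(piece, sizeof piece, "{U+%04X}", cp); break;
            case MODE_ESCAPE_ICU:     snprintf(piece, sizeof piece, "%%U%04X", cp); break;
            case MODE_ESCAPE_JAVA:    snprintf(piece, sizeof piece, "\\u%04X", cp); break;
            case MODE_ESCAPE_XML_DEC: snprintf(piece, sizeof piece, "&#%u;", cp); break;
            case MODE_ESCAPE_XML_HEX: snprintf(piece, sizeof piece, "&#x%X;", cp); break;
            }
            plen = strlen(piece);
        }
        if (used + plen + 1 > cap) {
            out[0] = '\0';          /* buffer too small: give up */
            return out;
        }
        memcpy(out + used, piece, plen);
        used += plen;
        out[used] = '\0';
    }
    return out;
}
```

Calling to_latin1() on the code points of "< \u30AB >" with each mode yields
the strings in the RESULT column.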
341
342With a runtime encoding of UTF-8, the conversion of the (illegal) sequence:
343
344 $str = (unicode)b"< \xe9\xfe >";
345
346results in:
347
    MODE                  RESULT
    --------------------------------------
    stop                  ""
    skip                  ""
    substitute            ""
    escape (Unicode)      "< %XE9%XFE >"
    escape (ICU)          "< %XE9%XFE >"
    escape (Java)         "< \xE9\xFE >"
    escape (XML decimal)  "< &#233;&#254; >"
    escape (XML hex)      "< &#xE9;&#xFE; >"
358
359The substitution character can be set only for FROM_UNICODE direction and has to
360exist in the target character set. The default substitution character is (?).
361
362NOTE: Casting is just a shortcut for using unicode.runtime_encoding. To convert
363using an alternative encoding, use the unicode_encode() and unicode_decode()
364functions. For example,
365
366 $str = unicode_encode($uni, 'koi8-r', U_CONV_ERROR_SUBST);
367
368results in a binary KOI8-R encoded string.
369
370Creating a Custom Error Handler
371-------------------------------
372If an error occurs during the conversion, PHP outputs a warning describing the
373problem. Instead of this default behavior, PHP can invoke a user-provided error
374handler, similar to how the current user-defined error handler works. To set
375the custom conversion error handler, call:
376
377 mixed unicode_set_error_handler(callback error_handler)
378
379The function returns the previously defined custom error handler. If no error
380handler was defined, or if an error occurs when returning the handler, this
381function returns NULL.
382
383When the custom handler is set, the standard error handler is bypassed. It is
384the responsibility of the custom handler to output or log any messages, raise
385exceptions, or die(), if necessary. However, if the custom error handler returns
386FALSE, the standard handler will be invoked afterwards.
387
388The user function specified as the error_handler must accept five parameters:
389
390 mixed error_handler($direction, $encoding, $char_or_byte, $offset,
391 $message)
392
393where:
394
395 $direction - the direction of conversion, FROM_UNICODE/TO_UNICODE
396
397 $encoding - the name of the encoding to/from which the conversion
398 was attempted
399
400 $char_or_byte - either Unicode character or byte sequence (depending
401 on direction) which caused the error
402
403 $offset - the offset of the failed character/byte sequence in
404 the source string
405
406 $message - the error message describing the problem
407
408NOTE: If the error mode set by unicode_set_error_mode() is substitute,
409skip, or escape, the handler won't be called, since these are non-error
410causing operations. To always invoke your handler, set the error mode to
411U_CONV_ERROR_STOP.
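
The dispatch rule described in this section (a custom handler runs first, and
the standard handler still runs if the custom one returns FALSE) can be modeled
in plain C. Everything below is a hypothetical stand-alone model: the type and
function names are invented for illustration, and a counter stands in for
emitting the standard warning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the five parameters described above; returning false asks for the
 * standard handler to run afterwards. */
typedef bool (*conv_error_handler)(int direction, const char *encoding,
                                   const char *char_or_byte, long offset,
                                   const char *message);

static conv_error_handler current_handler = NULL;
static int standard_handler_calls = 0;  /* stands in for emitting a warning */

/* Installs a handler and returns the previous one (NULL if none). */
conv_error_handler set_conv_error_handler(conv_error_handler handler)
{
    conv_error_handler previous = current_handler;
    current_handler = handler;
    return previous;
}

void report_conv_error(int direction, const char *encoding,
                       const char *char_or_byte, long offset,
                       const char *message)
{
    if (current_handler != NULL &&
        current_handler(direction, encoding, char_or_byte, offset, message)) {
        return;                     /* the custom handler dealt with it */
    }
    standard_handler_calls++;       /* no handler, or it returned false */
}

/* Sample handler that fully handles the error. */
bool handler_accept(int direction, const char *encoding,
                    const char *char_or_byte, long offset, const char *message)
{
    (void)direction; (void)encoding; (void)char_or_byte;
    (void)offset; (void)message;
    return true;
}

/* Sample handler that defers to the standard handler. */
bool handler_decline(int direction, const char *encoding,
                     const char *char_or_byte, long offset, const char *message)
{
    (void)direction; (void)encoding; (void)char_or_byte;
    (void)offset; (void)message;
    return false;
}
```

The model does not cover the error-mode check from the NOTE above; it assumes
the error mode is already U_CONV_ERROR_STOP.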
412
413
414Unicode String Type
415===================
416
The Unicode string type (IS_UNICODE) is supposed to contain text data encoded in
UTF-16. This is the main string type in PHP when the Unicode semantics switch is
turned on. Unicode strings can exist when the switch is off, but they have to be
produced programmatically via calls to functions that return Unicode types.
421
422
423Binary String Type
424==================
425
The binary string type (IS_STRING) serves two purposes: backwards compatibility
and representing non-Unicode strings and binary data. When the Unicode semantics
switch is off, it is used for all strings in PHP, the same as in previous
versions. When the switch is on, this type is used to store text in other
encodings as well as true binary data such as images, PDFs, etc.
431
432Printing binary data to the standard output passes it through as-is, independent
433of the output encoding.
434
For examples of specifying binary string literals, refer to the section
"Language Modifications".
437
438Language Modifications
439======================
440
If the Unicode semantics switch is turned on, PHP string literals
(single-quoted, double-quoted, and heredocs) become Unicode strings (IS_UNICODE
type). String literals support all the same escape sequences and variable
interpolations as before, plus several new escape sequences.
445
446PHP interprets the contents of strings as follows:
447
448 - all non-escaped characters are interpreted as a corresponding Unicode
449 codepoint based on the current script encoding, e.g. ASCII 'a' (0x61) =>
450 U+0061, Shift-JIS (0x92 0x86) => U+4E2D
451
452 - existing PHP escape sequences are also interpreted as Unicode codepoints,
453 including \xXX (hex) and \OOO (octal) numbers, e.g. "\x20" => U+0020
454
455 - two new escape sequences, \uXXXX and \UXXXXXX, are interpreted as a 4 or
456 6-hex Unicode codepoint value, e.g. \u0221 => U+0221, \U010410 =>
457 U+10410. (Having two sequences avoids the ambiguity of \u020608 --
458 is that supposed to be U+0206 followed by "08", or U+020608 ?)
459
460 - a new escape sequence allows specifying a character by its full
461 Unicode name, e.g. \C{THAI CHARACTER PHO SAMPHAO} => U+0E20
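
The difference between the 4-digit and 6-digit escapes can be sketched with a
small parser. This is an illustrative C function with a hypothetical name, not
the PHP scanner; it handles only \uXXXX and \UXXXXXX and ignores the \C{...}
name form.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Parses one \uXXXX or \UXXXXXX escape at the start of s. On success stores
 * the code point in *cp and returns the number of characters consumed
 * (6 or 8). Returns 0 if s does not start with a well-formed escape. */
size_t parse_codepoint_escape(const char *s, uint32_t *cp)
{
    size_t digits, i;
    uint32_t value = 0;

    assert(s != NULL && cp != NULL);
    if (s[0] != '\\') {
        return 0;
    }
    if (s[1] == 'u') {
        digits = 4;
    } else if (s[1] == 'U') {
        digits = 6;
    } else {
        return 0;
    }
    for (i = 0; i < digits; i++) {
        unsigned char c = (unsigned char)s[2 + i];
        if (!isxdigit(c)) {
            return 0;       /* also catches the terminating NUL */
        }
        value = value * 16 +
                (uint32_t)(isdigit(c) ? c - '0' : tolower(c) - 'a' + 10);
    }
    if (value > 0x10FFFF) {
        return 0;           /* beyond the Unicode code space */
    }
    *cp = value;
    return 2 + digits;
}
```

Because the digit count is fixed by the letter, "\u020608" consumes six
characters and yields U+0206, leaving "08" as ordinary text.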
462
463PHP allows variable interpolation inside the double-quoted and heredoc strings.
464However, the parser separates the string into literal and variable chunks during
465compilation, e.g. "abc $var def" -> "abc" . $var . "def". This means that PHP
466can handle literal chunks in the normal way as far as Unicode support is
467concerned.
468
469Since all string literals become Unicode by default, PHP 6 introduces new syntax
470for creating byte-oriented or binary strings. Prefixing a string literal with
471the letter 'b' creates a binary string:
472
473 $var = b'abc\001';
474 $var = b"abc\001";
475 $var = b<<<EOD
476 abc\001
477 EOD;
478
479The content of a binary string is the literal byte sequence inside the
480delimiters, which depends on the script encoding (unicode.script_encoding).
481Binary string literals support the same escape sequences as PHP 5 strings. If
482the Unicode switch is turned off, then the binary string literals generate the
483normal string (IS_STRING) type internally without any effect on the application.
484
The string operators now accommodate the new IS_UNICODE type alongside IS_STRING:
486
487 - The concatenation operator (.) and concatenation assignment operator (.=)
488 automatically coerce the IS_STRING type to the more precise IS_UNICODE if
489 the operands are of different string types.
490
491 - The string indexing operator [] now accommodates IS_UNICODE type strings
492 and extracts the specified character. To support supplementary characters,
493 the index specifies a code point, not a byte or a code unit.
494
495 - Bitwise operators and increment/decrement operators do not work on
496 Unicode strings. They do work on binary strings.
497
498 - Two new casting operators are introduced, (unicode) and (binary). The
499 (string) operator casts to Unicode type if the Unicode semantics switch is
500 on, and to binary type otherwise.
501
502 - The comparison operators compare Unicode strings in binary code point
503 order. They also coerce strings to Unicode if the strings are of different
504 types.
505
506 - The arithmetic operators use the same semantics as today for converting
507 strings to numbers. A Unicode string is considered numeric if it
508 represents a long or a double number in the en_US_POSIX locale.
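
Indexing by code point rather than by code unit means that a surrogate pair
counts as one position. The following C sketch shows one way to do that on a
UTF-16 buffer; it is an assumption about the general technique, not the
engine's code, and the names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CP_OUT_OF_RANGE 0xFFFFFFFFu

/* Returns the code point at code point position `index` in a UTF-16 buffer
 * of `len` code units, or CP_OUT_OF_RANGE if the index is past the end. */
uint32_t utf16_codepoint_at(const uint16_t *s, size_t len, size_t index)
{
    size_t i = 0;

    assert(s != NULL || len == 0);
    while (i < len) {
        uint32_t cp = s[i];
        size_t width = 1;

        /* A high surrogate followed by a low surrogate encodes one
         * supplementary code point in two code units. */
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < len &&
            s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
            cp = 0x10000 + ((cp - 0xD800) << 10) + (s[i + 1] - 0xDC00);
            width = 2;
        }
        if (index == 0) {
            return cp;
        }
        index--;
        i += width;
    }
    return CP_OUT_OF_RANGE;
}
```

The lookup is linear in the string length, which is one reason to prefer an
iterator over repeated indexing when walking a whole string.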
509
510
511Unicode Support in Existing Functions
512=====================================
513
514All functions in the PHP default distribution are undergoing analysis to
515determine which functions need to be upgraded for native Unicode support.
516You can track progress here:
517
518 http://www.php.net/~scoates/unicode/render_func_data.php
519
520Key extensions that are fully converted include:
521
522 * curl
523 * dom
524 * json
525 * mysql
526 * mysqli
527 * oci8
528 * pcre
529 * reflection
530 * simplexml
531 * soap
532 * sqlite
533 * xml
534 * xmlreader/xmlwriter
535 * xsl
536 * zlib
537
538NOTE: Unsafe functions might still work, since PHP performs Unicode conversions
539at runtime. However, unsafe functions might not work correctly with multibyte
540binary strings, or Unicode characters that are not representable in the
541specified unicode.runtime_encoding.
542
543
544Identifiers
545===========
546
547Since scripts may be written in various encodings, we do not restrict
548identifiers to be ASCII-only. PHP allows any valid identifier based
549on the Unicode Standard Annex #31.
550
551
552Numbers
553=======
554
Unlike identifiers, numbers must consist only of ASCII digits, and are
restricted to the en_US_POSIX or C locale. In other words, numbers have no
thousands separator, and the fractional separator is (.) "full stop". Numeric
strings adhere to the same rules, so "10,3" is not interpreted as a number even
if the current locale's fractional separator is a comma.
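
A locale-independent check along these lines can be sketched in C. The function
below is hypothetical and deliberately narrow: it accepts only an optional sign,
ASCII digits, and at most one '.', and it does not attempt to cover the other
forms PHP's numeric-string rules may accept (for example exponents).

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if s is a plain decimal number written with ASCII digits and
 * '.' as the fractional separator, 0 otherwise. Never consults the locale. */
int is_c_locale_number(const char *s)
{
    size_t int_digits = 0, frac_digits = 0;

    assert(s != NULL);
    if (*s == '+' || *s == '-') {
        s++;
    }
    while (*s >= '0' && *s <= '9') {
        s++;
        int_digits++;
    }
    if (*s == '.') {
        s++;
        while (*s >= '0' && *s <= '9') {
            s++;
            frac_digits++;
        }
    }
    /* must have consumed everything and seen at least one digit */
    return *s == '\0' && (int_digits + frac_digits) > 0;
}
```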
560
561TextIterators
562=============
563
564Instead of using the offset operator [] to access characters in a linear
565fashion, use a TextIterator instead. TextIterator is very fast and enables you
566to iterate over code points, combining sequences, characters, words, lines, and
567sentences, both forward and backward. For example:
568
    $text = "nai\u0308ve";
    foreach (new TextIterator($text) as $u) {
        var_inspect($u);
    }
573
574lists six code points, including the umlaut (U+0308) as a separate code point.
575Instantiating the TextIterator to iterate over characters,
576
    $text = "nai\u0308ve";
    foreach (new TextIterator($text, TextIterator::CHARACTER) as $u) {
        var_inspect($u);
    }
581
582lists five characters, including an "i" with an umlaut as a single character.
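
The six-versus-five count can be reproduced with a deliberately simplified C
model. It is an assumption-laden sketch, not how TextIterator or ICU segment
text: it treats only the Combining Diacritical Marks block (U+0300 to U+036F)
as attaching to the preceding code point, whereas real character boundaries
follow much richer rules.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplification: only U+0300..U+036F are treated as combining marks. */
static int is_combining_diacritic(uint32_t cp)
{
    return cp >= 0x0300 && cp <= 0x036F;
}

/* Counts base-plus-marks sequences in an array of n code points. */
size_t count_combining_sequences(const uint32_t *cps, size_t n)
{
    size_t i, count = 0;

    assert(cps != NULL || n == 0);
    for (i = 0; i < n; i++) {
        /* a mark extends the previous sequence, except at the very start */
        if (i == 0 || !is_combining_diacritic(cps[i])) {
            count++;
        }
    }
    return count;
}
```

For the code points n, a, i, U+0308, v, e the array has six entries and the
function returns 5.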
583
584Locales
585=======
586
587Unicode support in PHP relies exclusively on ICU locales, NOT the POSIX locales
588installed on the system. You may access the default ICU locale using:
589
590 locale_set_default()
591 locale_get_default()
592
593ICU locale IDs have a somewhat different format from POSIX locale IDs. The ICU
594syntax is:
595
596 <language>[_<script>]_<country>[_<variant>][@<keywords>]
597
598For example, sr_Latn_YU_REVISED@currency=USD is Serbian (Latin, Yugoslavia,
599Revised Orthography, Currency=US Dollar).
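
To show how the pieces of the syntax line up, here is a small C sketch that
splits a locale ID into its parts. It is a hypothetical helper, not an ICU API,
and it relies on one assumption not stated above: a second subtag of exactly
four characters is taken to be the script.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct locale_parts {
    char language[16];
    char script[16];
    char country[16];
    char variant[32];
    char keywords[64];
};

/* Copies [start, end) into dst, truncating to cap - 1 characters. */
static void copy_span(char *dst, size_t cap, const char *start, const char *end)
{
    size_t n = (size_t)(end - start);
    if (n >= cap) {
        n = cap - 1;
    }
    memcpy(dst, start, n);
    dst[n] = '\0';
}

/* Splits id into its parts. Returns 1 if a language was found, else 0. */
int parse_icu_locale(const char *id, struct locale_parts *out)
{
    const char *at, *end, *p;
    int field = 0;  /* 0 language, 1 script-or-country, 2 country, 3 variant */

    assert(id != NULL && out != NULL);
    memset(out, 0, sizeof *out);
    at = strchr(id, '@');
    end = (at != NULL) ? at : id + strlen(id);
    if (at != NULL) {
        copy_span(out->keywords, sizeof out->keywords,
                  at + 1, at + 1 + strlen(at + 1));
    }
    p = id;
    while (p < end && field < 4) {
        const char *sep = memchr(p, '_', (size_t)(end - p));
        const char *stop = (sep != NULL) ? sep : end;

        if (field == 0) {
            copy_span(out->language, sizeof out->language, p, stop);
            field = 1;
        } else if (field == 1 && stop - p == 4) {
            copy_span(out->script, sizeof out->script, p, stop);
            field = 2;
        } else if (field == 1 || field == 2) {
            copy_span(out->country, sizeof out->country, p, stop);
            field = 3;
        } else {
            /* everything left before '@' is the variant */
            copy_span(out->variant, sizeof out->variant, p, end);
            field = 4;
        }
        p = (sep != NULL) ? sep + 1 : end;
    }
    return out->language[0] != '\0';
}
```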
600
601Do not use the deprecated setlocale() function. This function interacts with the
602POSIX locale. If Unicode semantics are on, using setlocale() generates
603a deprecation warning.
604
Document TODO
=============
607- Final review.
608- Fix the HTTP Input Encoding section, that's obsolete now.
609
610
611References
612==========
613
614 Unicode
615 http://www.unicode.org
616
617 Unicode Glossary
618 http://www.unicode.org/glossary/
619
620 UTF-8
621 http://www.utf-8.com/
622
623 UTF-16
624 http://www.ietf.org/rfc/rfc2781.txt
625
626 ICU Homepage
627 http://www.ibm.com/software/globalization/icu/
628
629 ICU User Guide and API Reference
630 http://icu.sourceforge.net/
631
632 Unicode Annex #31
633 http://www.unicode.org/reports/tr31/
634
635 PHP Parameter Parsing API
636 http://www.php.net/manual/en/zend.arguments.retrieval.php
637
638
639Authors
640=======
641 Andrei Zmievski <andrei@gravitonic.com>
642 Evan Goer <goer@yahoo-inc.com>
643
644vim: set et tw=80 :
645
README.UNICODE-UPGRADES
1This document attempts to describe portions of the API related to the new
2Unicode functionality and the best practices for upgrading existing
3functions to support Unicode.
4
5Your first stop should be README.UNICODE: it covers the general Unicode
6functionality and concepts without going into technical implementation
7details.
8
9Internal Encoding
10=================
11
12UTF-16 is the internal encoding used for Unicode strings. UTF-16 consumes
13two bytes for any Unicode character in the Basic Multilingual Plane, which
14is where most of the current world's languages are represented. While being
15less memory efficient for basic ASCII text it simplifies the processing and
16makes interfacing with ICU easier, since ICU uses UTF-16 for its internal
17processing as well.
18
19
20Zval Structure Changes
21======================
22
23For IS_UNICODE type, we add another structure to the union:
24
25 union {
26 ....
27 struct {
28 UChar *val; /* Unicode string value */
29 int len; /* number of UChar's */
30 } ustr;
31 ....
32 } value;
33
34This cleanly separates the two types of strings and helps preserve backwards
35compatibility.
36
37To optimize access to IS_STRING and IS_UNICODE storage at runtime, we need yet
38another structure:
39
40 union {
41 ....
42 struct { /* Universal string type */
43 zstr val;
44 int len;
45 } uni;
46 ....
47 } value;
48
49Where zstr is a union of char*, UChar*, and void*.
50
51
52Parameter Parsing API Modifications
53===================================
54
There are now six new specifiers: 'u', 't', 'T', 'U', 'S', 'x' and a new '&'
modifier.
57
58 't' specifier
59 -------------
60 This specifier indicates that the caller requires the incoming parameter to be
61 string data (IS_STRING, IS_UNICODE). The caller has to provide the storage for
62 string value, length, and type.
63
64 void *str;
65 int len;
66 zend_uchar type;
67
68 if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "t", &str, &len, &type) == FAILURE) {
69 return;
70 }
71 if (type == IS_UNICODE) {
72 /* process Unicode string */
73 } else {
74 /* process binary string */
75 }
76
77 For IS_STRING type, the length represents the number of bytes, and for
78 IS_UNICODE the number of UChar's. When converting other types (numbers,
79 booleans, etc) to strings, the exact behavior depends on the Unicode semantics
80 switch: if on, they are converted to IS_UNICODE, otherwise to IS_STRING.
81
82
83 'u' specifier
84 -------------
  This specifier indicates that the caller requires the incoming parameter
  to be a Unicode encoded string. If a non-Unicode string is passed, the engine
  creates a copy of the string and automatically converts it to Unicode type
  before passing it to the internal function. No such conversion is necessary
  for Unicode strings, obviously.
90
91 UChar *str;
92 int len;
93
94 if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "u", &str, &len) == FAILURE) {
95 return;
96 }
97 /* process Unicode string */
98
99
100 'T' specifier
101 -------------
102 This specifier is useful when the function takes two or more strings and
103 operates on them. Using 't' specifier for each one would be somewhat
104 problematic if the passed-in strings are of mixed types, and multiple
105 checks need to be performed in order to do anything. All parameters
106 marked by the 'T' specifier are promoted to the same type.
107
  If at least one of the 'T' parameters is of Unicode type, then the rest of
  them are converted to IS_UNICODE. Otherwise all 'T' parameters are converted
  to IS_STRING type.
111
112
113 void *str1, *str2;
114 int len1, len2;
115 zend_uchar type1, type2;
116
117 if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "TT", &str1, &len1,
118 &type1, &str2, &len2, &type2) == FAILURE) {
119 return;
120 }
121 if (type1 == IS_UNICODE) {
122 /* process as Unicode, str2 is guaranteed to be Unicode as well */
123 } else {
124 /* process as binary string, str2 is guaranteed to be the same */
125 }
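
The promotion rule itself is small enough to state as code. The following is a
stand-alone C sketch with hypothetical names (the TOY_ prefix marks stand-ins
for the real IS_STRING and IS_UNICODE constants); it models only the decision,
not the conversion the engine performs.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for IS_STRING and IS_UNICODE. */
enum toy_type { TOY_IS_STRING, TOY_IS_UNICODE };

/* Returns the type every 'T' parameter ends up with: Unicode if at least
 * one of them is Unicode, binary string otherwise. */
enum toy_type common_T_type(const enum toy_type *types, size_t n)
{
    size_t i;

    assert(types != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (types[i] == TOY_IS_UNICODE) {
            return TOY_IS_UNICODE;
        }
    }
    return TOY_IS_STRING;
}
```

This is why checking type1 alone is enough in the example above: after parsing,
every 'T' parameter shares the same type.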
126
127
128 'x' specifier
129 -------------
130 This specifier acts as either 'u' or 's', depending on the value of the
131 unicode semantics switch. If UG(unicode) is on, it behaves as 'u', and as
132 's' otherwise.
133
134The existing 's' specifier has been modified as well. If a Unicode string is
135passed in, it automatically copies and converts the string to the runtime
136encoding, and issues a warning. If a binary type is passed-in, no conversion
137is necessary. The '&' modifier can be used after 's' specifier to force
138a different converter instead.
139
140 char *str;
141 int len;
142
143 if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s&", &str, &len, UG(utf8_conv)) == FAILURE) {
144 return;
145 }
146 /* here str is in UTF-8, if a Unicode string was passed in */
147
148The 'U' and 'S' specifiers are similar to 'u' and 's' but they are more strict
149about the type of the passed-in parameter. If 'U' is specified and the binary
150string is passed in, the engine will issue a warning instead of doing automatic
151conversion. The converse applies to the 'S' specifier.
152
153
154Working in Unicode World
155========================
156
157Strings
158-------
159
160A lot of internal functionality is controlled by the unicode.semantics
161switch. Its value is found in the Unicode globals variable, UG(unicode). It
162is either on or off for the entire request.
163
The big thing is that there is a new string type: IS_UNICODE.
This has its own storage in the value union part of
zval (value.ustr) while non-unicode (binary) strings reuse the
IS_STRING type and the value.str element of the zval.
168
169New macros exist (parallel to Z_STRVAL/Z_STRLEN) for accessing unicode strings.
170
171Z_USTRVAL(), Z_USTRLEN()
172 - accesses the value (as a UChar*) and length (in code units) of the Unicode type string
173 value.ustr.val value.ustr.len
174
175Z_UNIVAL(), Z_UNILEN()
176 - accesses the value (as a zstr) and length (in type-appropriate units)
177 value.uni.val value.uni.len
178
179Z_USTRCPLEN()
180 - gives the number of codepoints (not units) in the Unicode type string
181 This macro examines the actual string taking into account Surrogate Pairs
182 and returns the number of UChar32(UTF32) codepoints which may be less than the
183 number of UChar(UTF16) codeunits found in the string buffer.
184 If this value will be used repeatedly, consider storing it in a local variable
185 to avoid having to reexamine the string every time.
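
What such a count involves can be sketched in portable C. This is not the
Z_USTRCPLEN() implementation, only an illustration of the idea with a
hypothetical function name; uint16_t stands in for UChar.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Counts code points in a UTF-16 buffer of `units` code units. A well-formed
 * surrogate pair counts once; an unpaired surrogate counts as one. */
size_t utf16_codepoint_count(const uint16_t *s, size_t units)
{
    size_t i = 0, count = 0;

    assert(s != NULL || units == 0);
    while (i < units) {
        if (s[i] >= 0xD800 && s[i] <= 0xDBFF && i + 1 < units &&
            s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
            i += 2;     /* surrogate pair: two units, one code point */
        } else {
            i += 1;
        }
        count++;
    }
    return count;
}
```

The whole buffer has to be scanned on every call, which is the cost the advice
about caching the result in a local variable is meant to avoid.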
186
187
188ZVAL_* macros
189-------------
190
191The 'dup' parameter to the ZVAL_STRING()/RETVAL_STRING()/RETURN_STRING() type
192macros has been extended slightly. The following defines are now encouraged instead:
193
194#define ZSTR_DUPLICATE (1<<0)
195#define ZSTR_AUTOFREE (1<<1)
196
197ZSTR_DUPLICATE (which has a resulting value of 1) serves the same purpose as a
198truth value in old-style 'dup' flags. The value of 1 was specifically chosen
199to match the common practice of passing a 1 for this parameter.
200Warning: If you find extension code which uses a truth value other than one for
201the dup flag, its logic should be modified to explicitly pass ZSTR_DUPLICATE instead.
202
203ZSTR_AUTOFREE is used with macros such as ZVAL_RT_STRING which may populate Unicode
204zvals from non-unicode source strings. When UG(unicode) is on, the source string
205will be implicitly copied (to make a UChar* version). If the original string
206needed copying anyway this is fine. However if the original string was emalloc()'d
207and would have ordinarily been given to the engine (i.e. RETURN_STRING(estrdup("foo"), 0))
208then it will need to be freed in UG(unicode) mode to avoid leaking.
209The ZSTR_AUTOFREE flag ensures that the original string is freed in UG(unicode) mode.
210
ZVAL_UNICODE(pzv, str, dup), ZVAL_UNICODEL(pzv, str, str_len, dup)
 - Sets zval to hold a Unicode string. Takes the same parameters as
   ZVAL_STRING(), ZVAL_STRINGL().

ZVAL_U_STRING(conv, pzv, str, dup), ZVAL_U_STRINGL(conv, pzv, str, str_len, dup)
 - When UG(unicode) is off, it's equivalent to ZVAL_STRING(), ZVAL_STRINGL()
   and the conv parameter is ignored.
   When UG(unicode) is on, it sets zval to hold a Unicode representation of the
   passed-in string using the UConverter* specified by conv.
   Since a new string is always created in this case, passing ZSTR_DUPLICATE
   for 'dup' does not matter, but ZSTR_AUTOFREE will be used to
   efree the original value.

ZVAL_RT_STRING(pzv, str, dup), ZVAL_RT_STRINGL(pzv, str, str_len, dup)
 - When UG(unicode) is off, it's equivalent to ZVAL_STRING(), ZVAL_STRINGL().
   When UG(unicode) is on, it takes the input string, converts it to Unicode
   using the runtime_encoding converter and sets zval to it.
   Since a new string is always created in this case, passing ZSTR_DUPLICATE
   for 'dup' does not matter, but ZSTR_AUTOFREE will be used to
   efree the original value.

ZVAL_ASCII_STRING(pzv, str, dup), ZVAL_ASCII_STRINGL(pzv, str, str_len, dup)
 - When UG(unicode) is off, it's equivalent to ZVAL_STRING(), ZVAL_STRINGL().
   When UG(unicode) is on, it takes the input string, converts it to Unicode
   using an ASCII converter and sets zval to it.
   Since a new string is always created in this case, passing ZSTR_DUPLICATE
   for 'dup' does not matter, but ZSTR_AUTOFREE will be used to
   efree the original value.

ZVAL_UTF8_STRING(pzv, str, dup), ZVAL_UTF8_STRINGL(pzv, str, str_len, dup)
 - When UG(unicode) is off, it's equivalent to ZVAL_STRING(), ZVAL_STRINGL().
   When UG(unicode) is on, it takes the input string, converts it to Unicode
   using a UTF8 converter and sets zval to it.
   Since a new string is always created in this case, passing ZSTR_DUPLICATE
   for 'dup' does not matter, but ZSTR_AUTOFREE will be used to
   efree the original value.

ZVAL_ZSTR(pzv, type, zstr, dup), ZVAL_ZSTRL(pzv, type, zstr, zstr_len, dup)
 - This macro uses 'type' to switch between calling ZVAL_STRING(pzv, zstr.s, dup)
   and ZVAL_UNICODE(pzv, zstr.u, dup). No conversion happens so the
   presence or absence of ZSTR_AUTOFREE is ignored.
252
253ZVAL_TEXT(pzv, zstr, dup), ZVAL_TEXTL(pzv, zstr, zstr_len, dup)
254 - This macro sets the zval to hold either a Unicode or a normal string,
255 depending on the value of UG(unicode). No conversion happens, so be certain
256 that the string passed in matches the type expected by UG(unicode).
257 One example of its usage would be to initialize zval to hold the name
258 of a user function.
259
260ZVAL_EMPTY_UNICODE(pzv) / ZVAL_EMPTY_TEXT(pzv)
261 - These macros work identically to ZVAL_EMPTY_STRING() with the UNICODE
262 version always generating an IS_UNICODE zval, and the TEXT version
263 generating a UG(unicode) dependent string type.
264
265ZVAL_UCHAR32(pzv, char)
266 - Converts the character provided into a UChar string (which may potentially
267 be 1 or 2 characters long in the case of surrogate pairs) and dispatches
268 to ZVAL_UNICODEL().
269
270
271As usual, for each ZVAL_* macro, there is a matching RETVAL_* and RETURN_* macro.
272
273Conversion Macros
274-----------------
275
276convert_to_string_with_converter(zval *op, UConverter *conv)
277 - converts a zval to native string using the specified converter, if necessary.
278
279convert_to_unicode_with_converter(zval *op, UConverter *conv)
280 - converts a zval to Unicode string using the specified converter, if
281 necessary.
282
283convert_to_unicode(zval *op)
284 - converts a zval to Unicode string.
285
286convert_to_string(zval *op)
287 - Behaves just as it currently does, converting to IS_STRING type
288
289convert_to_text(zval *op)
290 - converts a zval to either Unicode or native string, depending on the
291 value of UG(unicode) switch
292
293zend_ascii_to_unicode() function can be used to convert an ASCII char*
294string to Unicode. This is useful especially for inline string literals, in
295which case you can simply use USTR_MAKE() macro, e.g.:
296
297 UChar* ustr;
298
299 ustr = USTR_MAKE("main");
300
301If you need to initialize a few such variables, it may be more efficient to
302use ICU macros, which avoid the conversion, depending on the platform. See
303[1] for more information.
304
USTR_FREE(zstr) can be used to free a UChar* string safely, since it checks for
a NULL argument. USTR_LEN() takes a zstr as its argument and, depending on the
UG(unicode) value, returns its strlen() or u_strlen().
308
309Array Manipulation
310------------------
311
312The add_next_index_*(), add_index_*() and add_assoc_*() functions have been
313significantly expanded both to allow for the unicode type as a value and to
314permit various types of keys.
315
316Values: In the following examples, {1} represents a placeholder for the keytype and
317its arguments (covered later).
318
319add_{1}_unicode(zval *arr, {1}, UChar *ustr, int dup);
320add_{1}_unicodel(zval *arr, {1}, UChar *ustr, int ustr_len, int dup);
321 - Works like add_{1}_string() and add_{1}_stringl() but takes a UChar* value
322 and adds an IS_UNICODE type.
323
324add_{1}_rt_string(zval *arr, {1}, char *str, int dup);
325add_{1}_rt_stringl(zval *arr, {1}, char *str, int str_len, int dup);
326 - Works like add_{1}_string() and add_{1}_stringl() but converts the char*
327 value to Unicode using runtime encoding when UG(unicode) is on.
328
329add_{1}_ascii_string(zval *arr, {1}, char *str, int dup);
330add_{1}_ascii_stringl(zval *arr, {1}, char *str, int str_len, int dup);
331 - Works like add_{1}_rt_string() and add_{1}_rt_stringl() but uses
332 an ASCII converter rather than runtime encoding.
333
334add_{1}_utf8_string(zval *arr, {1}, char *str, int dup);
335add_{1}_utf8_stringl(zval *arr, {1}, char *str, int str_len, int dup);
336 - Works like add_{1}_rt_string() and add_{1}_rt_stringl() but uses
337 a UTF8 converter rather than runtime encoding.
338
339add_{1}_text(zval *arr, {1}, zstr str, int dup);
340add_{1}_textl(zval *arr, {1}, zstr str, int str_len, int dup);
341 - Wrapper which dispatches to add_{1}_string(l)() or add_{1}_unicode(l)()
342 depending on the setting of UG(unicode).
343
344add_{1}_zstr(zval *arr, {1}, zend_uchar type, zstr str, int dup);
345add_{1}_zstrl(zval *arr, {1}, zend_uchar type, zstr str, int str_len, int dup);
346 - Works like add_{1}_text() and add_{1}_textl(), but dispatches based on 'type'.
347
348
Keys: In the following example, the zval* type is used for values; however,
each of the value types (including those listed above) is supported.
351
352The existing key types work as they always have:
353 add_next_index_zval(zval *arr, zval *val);
354 add_index_zval(zval *arr, long idx, zval *val);
355 add_assoc_zval(zval *arr, char *key, zval *val);
356 add_assoc_zval_ex(zval *arr, char *key, int key_len, zval *val);
357 . Associative keys are considered binary (IS_STRING)
358 . Remember that key_len includes the terminating NULL
359
360The following additional methods provide unicode capable keytypes:
361
362add_u_assoc_zval(zval *arr, zend_uchar type, zstr key, zval *val);
363add_u_assoc_zval_ex(zval *arr, zend_uchar type, zstr key, int key_len, zval *val);
  . When type==IS_STRING, these behave identically to their
    add_assoc_zval() and add_assoc_zval_ex() counterparts.
    When type==IS_UNICODE, the key is considered to be Unicode (UChar*).
367
368add_rt_assoc_zval(zval *arr, char *key, zval *val);
369add_rt_assoc_zval_ex(zval *arr, char *key, int key_len, zval *val);
370 . When UG(unicode) is off, these behave identically to their
371 add_assoc_zval() and add_assoc_zval_ex() counterparts.
372 When UG(unicode) is on, key is converted to Unicode using runtime encoding.
373
374add_ascii_assoc_zval(zval *arr, char *key, zval *val);
375add_ascii_assoc_zval_ex(zval *arr, char *key, int key_len, zval *val);
376 . When UG(unicode) is off, these behave identically to their
377 add_assoc_zval() and add_assoc_zval_ex() counterparts.
378 When UG(unicode) is on, key is converted to Unicode using an ASCII converter.
379
380add_utf8_assoc_zval(zval *arr, char *key, zval *val);
381add_utf8_assoc_zval_ex(zval *arr, char *key, int key_len, zval *val);
382 . When UG(unicode) is off, these behave identically to their
383 add_assoc_zval() and add_assoc_zval_ex() counterparts.
384 When UG(unicode) is on, key is converted to Unicode using a UTF8 converter.
385
386
387Keytype and Valuetype specification may be mixed in any combination, for example:
388add_utf8_assoc_ascii_stringl_ex(zval *arr, char *key, int key_len, char *val, int val_len, int dup);
389
390
391Miscellaneous
392-------------
393
394UBYTES() macro can be used to obtain the number of bytes necessary to store
395the given number of UChar's. The typical usage is:
396
397 char *constant_name = colon + (UG(unicode)?UBYTES(2):2);
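
The arithmetic behind this can be sketched in standalone C. This is a minimal illustration, not the Zend macro itself: the type sketch_uchar and the functions sketch_ubytes() and sketch_skip_chars() are hypothetical names, and a 16-bit code unit is assumed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for ICU's UChar: one 16-bit UTF-16 code unit. */
typedef uint16_t sketch_uchar;

/* Bytes needed to store n code units (the kind of arithmetic UBYTES() does). */
static size_t sketch_ubytes(size_t n)
{
    return n * sizeof(sketch_uchar);
}

/* Skip n characters in a byte buffer that holds either char data or
 * sketch_uchar data, depending on is_unicode. */
static const char *sketch_skip_chars(const char *p, int is_unicode, size_t n)
{
    return p + (is_unicode ? sketch_ubytes(n) : n);
}
```

Skipping two characters therefore advances the pointer by four bytes in the Unicode case and by two bytes otherwise.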
398
399
400Code Points and Code Units
401--------------------------
402
Unicode type strings are in the UTF-16 encoding, where 1 Unicode character
may be represented by 1 or 2 UChar's. Each UChar is referred to as a "code
unit", and a full Unicode character as a "code point". Consequently, the
number of code units and the number of code points for the same Unicode string
may be different. This has many implications, the most important of which is
that you cannot simply index the UChar* string to get the desired codepoint.
409
410The zval's value.ustr.len contains the number of code units (UChar -- UTF16).
411To obtain the number of code points, one can use u_countChar32() ICU API
412function or Z_USTRCPLEN() macro.
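
To make the difference between the two lengths concrete, here is a minimal standalone sketch that counts code points in a UTF-16 buffer without ICU. The function name sketch_count_codepoints() is hypothetical, and counting an unpaired surrogate as one code point is a choice made by this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Count code points in a UTF-16 buffer of len code units.
 * A high surrogate followed by a low surrogate counts as one code point;
 * an unpaired surrogate counts as one code point by itself. */
static size_t sketch_count_codepoints(const uint16_t *s, size_t len)
{
    size_t i = 0, count = 0;

    while (i < len) {
        uint16_t u = s[i++];
        if (u >= 0xD800 && u <= 0xDBFF && i < len &&
            s[i] >= 0xDC00 && s[i] <= 0xDFFF) {
            i++; /* consume the low surrogate of the pair */
        }
        count++;
    }
    return count;
}
```

For a buffer holding 'a', a surrogate pair and 'b', the code unit length is 4 while the code point count is 3.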
413
414ICU provides a number of macros for working with UTF-16 strings on the
415codepoint level [2]. They allow you to do things like obtain a codepoint at
416random code unit offset, move forward and backward over the string, etc.
There are two versions of iterator macros, *_SAFE and *_UNSAFE. It is strongly
recommended to use the *_SAFE versions, since they handle unpaired surrogates
and check for string boundaries. Here is an example of how to move through a
UChar* string and work on codepoints.
421
  UChar *str = ...;
  int32_t str_len = ...;
  UChar32 codepoint;
  int32_t offset = 0;

  while (offset < str_len) {
      U16_NEXT(str, offset, str_len, codepoint);
      /* now we have the Unicode character in codepoint */
  }
431
There is no macro to get a codepoint at a certain code point offset, but
there is a Zend API function that does it.
434
435 inline UChar32 zend_get_codepoint_at(UChar *str, int32_t length, int32_t n);
436
437To retrieve 3rd codepoint, you would call:
438
439 zend_get_codepoint_at(str, str_len, 3);
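
A lookup of this kind has to walk the string from the start, because surrogate pairs make the code unit offset unpredictable. The following standalone sketch shows the idea; sketch_codepoint_at() is a hypothetical name, and both the zero-based index and the -1 return value for an out-of-range index are assumptions of this sketch, not documented behaviour of the Zend function.

```c
#include <stddef.h>
#include <stdint.h>

/* Return the code point at zero-based code point index n in a UTF-16
 * buffer of len code units, or -1 if n is past the end of the string. */
static int32_t sketch_codepoint_at(const uint16_t *s, size_t len, size_t n)
{
    size_t i = 0;

    while (i < len) {
        uint32_t cp = s[i++];
        if (cp >= 0xD800 && cp <= 0xDBFF && i < len &&
            s[i] >= 0xDC00 && s[i] <= 0xDFFF) {
            /* combine the surrogate pair into one code point */
            cp = 0x10000 + ((cp - 0xD800) << 10) + (s[i++] - 0xDC00);
        }
        if (n-- == 0) {
            return (int32_t)cp;
        }
    }
    return -1;
}
```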
440
441If you have a UChar32 codepoint and need to put it into a UChar* string,
442there is another helper function, zend_codepoint_to_uchar(). It takes
443a single UChar32 and converts it to a UChar sequence (1 or 2 UChar's).
444
445 UChar buf[8];
446 UChar32 codepoint = 0x101a2;
447 int8_t num_uchars;
448 num_uchars = zend_codepoint_to_uchar(codepoint, buf);
449
The return value is the number of resulting UChar's, or 0, which indicates
an invalid codepoint.
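
The conversion itself is plain UTF-16 encoding. As a rough standalone sketch (sketch_codepoint_to_utf16() is a hypothetical name; rejecting surrogate values as invalid is a choice of this sketch and may differ from what the Zend helper does):

```c
#include <stdint.h>

/* Write cp to buf as UTF-16 and return the number of code units written
 * (1 or 2), or 0 if cp is not a valid code point. buf must have room for
 * at least 2 code units. */
static int sketch_codepoint_to_utf16(uint32_t cp, uint16_t *buf)
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
        return 0;
    }
    if (cp < 0x10000) {
        buf[0] = (uint16_t)cp;
        return 1;
    }
    cp -= 0x10000;
    buf[0] = (uint16_t)(0xD800 + (cp >> 10));   /* high surrogate */
    buf[1] = (uint16_t)(0xDC00 + (cp & 0x3FF)); /* low surrogate */
    return 2;
}
```

With the codepoint 0x101a2 from the example above, this sketch writes the two code units 0xD800 and 0xDDA2.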
452
453
454Memory Allocation
455-----------------
456
457For ease of use and to reduce possible bugs, there are memory allocation
458functions specific to Unicode strings. Please use them at all times when
459allocating UChar's.
460
461 eumalloc(size)
462 eurealloc(ptr, size)
463 eustrndup(s, length)
464 eustrdup(s)
465
466 peumalloc(size, persistent)
467 peurealloc(ptr, size, persistent)
468
469The size parameter refers to the number of UChar's, not bytes.
470
471 zend_zstrndup(type, zstr, length)
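
The reason for counting in UChar's rather than bytes can be seen in a small standalone sketch. It uses malloc() instead of the Zend allocator, and sketch_ustrndup() is a hypothetical name used only for illustration.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Duplicate len UTF-16 code units and append a terminating 0 unit.
 * len is counted in code units, not bytes; the byte size is computed
 * in one place. Returns NULL on allocation failure; the caller releases
 * the result with free(). */
static uint16_t *sketch_ustrndup(const uint16_t *s, size_t len)
{
    uint16_t *copy = malloc((len + 1) * sizeof(uint16_t));

    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, s, len * sizeof(uint16_t));
    copy[len] = 0;
    return copy;
}
```

Keeping the multiplication by the code unit size inside the helper is what removes the usual off-by-a-factor-of-two mistakes at the call sites.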
472
473
474Hashes
475------
476
477Hashes API has been upgraded to work with Unicode and binary strings. All
478hash functions that worked with string keys now have their equivalent
479zend_u_hash_* API. The zend_u_hash_* functions take the type of the key
480string as the second argument.
481
482When UG(unicode) switch is on, the IS_STRING keys are upconverted to
483IS_UNICODE and then used in the hash lookup.
484
485A new HASH_KEY constant has been added for differentiating key types:
486 . HASH_KEY_IS_UNICODE
487
Note that zend_hash_get_current_key_ex() does not have a zend_u_hash_*
version. It returns the key as a char* pointer; you can cast it
appropriately based on the key type.
491
492
493Identifiers and Class Entries
494-----------------------------
495
496In Unicode mode all the identifiers are Unicode strings. This means that
497while various structures such as zend_class_entry, zend_function, etc store
498the identifier name as a char* pointer, it will actually point to UChar*
499string. Be careful when accessing the names of classes, functions, and such
500-- always check UG(unicode) before using them.
501
502
503Formatted Output
504----------------
505
Since UTF-16 strings frequently contain NULL bytes, you cannot simply use the
%s format to print them out. Towards that end, output functions such as
php_printf(), spprintf(), etc. now have the following formats for use with
Unicode strings:
510
511 %r
512 This format treats the corresponding argument as a Unicode string. The
513 string is automatically converted to the output encoding. If you wish to
514 apply a different converter to the string, use %*r and pass the
515 converter before the string argument.
516
      UChar *class_name = USTR_MAKE("ReflectionClass");
      zend_printf("%r", class_name);
      spprintf(&utf8_buffer, 0, "%*r", UG(utf8_conv), class_name);
520
521 %R
522 This format requires at least two arguments: the first one specifies the
523 type of the string to follow (IS_STRING or IS_UNICODE), and the second
524 one - the string itself. If the string is of Unicode type, it is
525 automatically converted to the output encoding. If you wish to apply
526 a different converter to the string, use %*R and pass the converter
527 before the string argument.
528
529 zend_throw_exception_ex(U_CLASS_ENTRY(reflection_exception_ptr), 0 TSRMLS_CC,
530 "Interface %R does not exist",
531 Z_TYPE_P(class_name), Z_UNIVAL_P(class_name));
532
533 %v
      This format takes only one parameter, the string, but the expected
      string type depends on the UG(unicode) value. If the string is of
      Unicode type, it is automatically converted to the output encoding. If
      you wish to apply a different converter to the string, use %*v and pass
      the converter before the string argument.
539
540 zend_error(E_WARNING, "%v::__toString() did not return anything",
541 Z_OBJCE_P(object)->name);
542
543 %Z
      This format prints a zval's value. You can specify the minimum length
      of the string representation using "%*Z" as well as the absolute length
      using "%.*Z". The following example is taken from the engine and
      therefore uses zend_spprintf rather than spprintf. Furthermore, clen is
      an integer that is smaller than Z_UNILEN_P(callable).
549
550 zend_spprintf(error, 0, "class '%.*Z' not found", clen, callable);
551
      This format can output any kind of zval value, as long as a
      string (or unicode) conversion is available. Note that printing
      non-string zvals outside of request time is not possible.
555
Since [v]spprintf() can only output native strings, there are also the new
functions [v]uspprintf() and [v]zspprintf(), which create unicode strings and
return the number of characters printed. That is, they return the length rather
than the byte size. The second pair of functions also takes an additional type
parameter that allows creating a string of arbitrary type. The following
example illustrates the use. Assume it fetches a unicode/native string into
path, path_len and path_type in order to create sub_name, sub_len and sub_type.
563
564 zstr path, sub_name;
565 int path_len, sub_len;
566 zend_uchar path_type, sub_type;
567
568 /* fetch */
569
570 if (path.v) {
571 sub_type = path_type;
572 sub_len = zspprintf(path_type, &sub_name, 0, "%R%c%s",
573 path_type, path,
574 DEFAULT_SLASH,
575 entry.d_name);
576 }
577
578
579Upgrading Functions
580===================
581
582Let's take a look at a couple of functions that have been upgraded to
583support new string types.
584
585substr()
586--------
587
This function returns part of a string based on offset and length
parameters.
590
  zstr str;
  int str_len, cp_len;
  zend_uchar str_type;
  long f, l; /* offset and length */

  if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "tl|l", &str, &str_len, &str_type, &f, &l) == FAILURE) {
      return;
  }
598
599The first thing we notice is that the incoming string specifier is 't',
600which means that we can accept all 3 string types. The 'str' variable is
601declared as zstr, because it can point to either UChar* or char*.
602The actual type of the incoming string is stored in 'str_type' variable.
603
  if (str_type == IS_UNICODE) {
      cp_len = u_countChar32(str.u, str_len);
  } else {
      cp_len = str_len;
  }
609
610If the string is a Unicode one, we cannot rely on the str_len value to tell
611us the number of characters in it. Instead, we call u_countChar32() to
612obtain it.
613
614The next several lines normalize start and length parameters to fit within the
615string. Nothing new here. Then we locate the appropriate segment.
616
  if (str_type == IS_UNICODE) {
      int32_t start = 0, end = 0;
      U16_FWD_N(str.u, end, str_len, f);
      start = end;
      U16_FWD_N(str.u, end, str_len, l);
      RETURN_UNICODEL(str.u + start, end-start, ZSTR_DUPLICATE);
623
Since codepoint (character) #n is not necessarily at offset #n in Unicode
strings, we start at the beginning and iterate forward until we have gone
through the required number of codepoints to reach the start of the segment.
Then we save the location in 'start' and continue iterating through the number
of codepoints specified by the length. Once that's done, we can return the
segment as a Unicode string.
630
  } else {
      RETURN_STRINGL(str.s + f, l, ZSTR_DUPLICATE);
  }
634
635For native strings, we can return the segment directly.
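
The segment location step in the Unicode branch can be tried outside of PHP with a small standalone sketch. It does not use ICU: sketch_fwd_n() is a hypothetical helper that does roughly the job of U16_FWD_N, and sketch_usubstr() is a hypothetical name for the two-pass walk described above.

```c
#include <stddef.h>
#include <stdint.h>

/* Advance offset i by up to n code points, never moving past len. */
static size_t sketch_fwd_n(const uint16_t *s, size_t i, size_t len, size_t n)
{
    while (n > 0 && i < len) {
        uint16_t u = s[i++];
        if (u >= 0xD800 && u <= 0xDBFF && i < len &&
            s[i] >= 0xDC00 && s[i] <= 0xDFFF) {
            i++; /* a surrogate pair is a single code point */
        }
        n--;
    }
    return i;
}

/* Locate the segment that starts at code point f and spans l code points.
 * Stores the starting code unit offset in *start and returns the segment
 * length in code units. */
static size_t sketch_usubstr(const uint16_t *s, size_t len,
                             size_t f, size_t l, size_t *start)
{
    size_t end = sketch_fwd_n(s, 0, len, f);

    *start = end;
    end = sketch_fwd_n(s, end, len, l);
    return end - *start;
}
```

Asking for 2 code points starting at code point 1 in a string made of 'a', a surrogate pair, 'b' and 'c' yields a segment of 3 code units starting at code unit offset 1.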
636
637
638strrev()
639--------
640
Let's look at strrev(), which requires a somewhat more complicated upgrade.
While one of the guidelines for upgrades is that combining sequences are not
really taken into account during processing -- substr() can break them up,
for example -- in this case, we actually should be concerned, because
reversing a combining sequence may result in a completely different string. To
illustrate:
647
648 a (U+0061 LATIN SMALL LETTER A)
649 o (U+006f LATIN SMALL LETTER O)
650 + ' (U+0301 COMBINING ACUTE ACCENT)
651 + _ (U+0320 COMBINING MINUS SIGN BELOW)
652 l (U+006C LATIN SMALL LETTER L)
653
654Reversing this would result in:
655
656 l (U+006C LATIN SMALL LETTER L)
657 + _ (U+0320 COMBINING MINUS SIGN BELOW)
658 + ' (U+0301 COMBINING ACUTE ACCENT)
659 o (U+006f LATIN SMALL LETTER O)
660 a (U+0061 LATIN SMALL LETTER A)
661
All of a sudden the combining marks are being applied to 'l' instead of 'o'.
To avoid this, we need to treat combining sequences as a unit, by checking
the combining character class of each character with u_getCombiningClass().
665
strrev() obtains its single argument, a string, and unless the string is of
Unicode type, processes it exactly as before, simply swapping bytes around.
For the Unicode case, the magic looks like this:
669
  int32_t i, x1, x2;
  UChar32 ch;
  UChar *u_s, *u_n, *u_p;

  u_n = eumalloc(Z_USTRLEN_PP(str)+1);
  u_p = u_n;
  u_s = Z_USTRVAL_PP(str);

  i = Z_USTRLEN_PP(str);
  while (i > 0) {
      U16_PREV(u_s, 0, i, ch);
      if (u_getCombiningClass(ch) == 0) {
          u_p += zend_codepoint_to_uchar(ch, u_p);
      } else {
          x2 = i;
          do {
              U16_PREV(u_s, 0, i, ch);
          } while (u_getCombiningClass(ch) != 0);
          x1 = i;
          while (x1 <= x2) {
              U16_NEXT(u_s, x1, Z_USTRLEN_PP(str), ch);
              u_p += zend_codepoint_to_uchar(ch, u_p);
          }
      }
  }
  *u_p = 0;
696
697The basic idea is to walk the string backwards from the end, using
698U16_PREV() macro. If the combining class of the current character is 0,
699meaning it's a base character and not a combining mark, we simply append it
700to the new string. Otherwise, we save the location of the index and do a run
701over the characters until we get to the next one with combining class 0. At
702that point we append the sequence as is, without reversing, to the new
703string. Voila.
704
705Note that the code uses zend_codepoint_to_uchar() to convert full Unicode
706characters (UChar32 type) to 1 or 2 UTF-16 code units (UChar type).
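
The grouping logic can be tried in isolation with a standalone sketch. To keep it short, it works on an array of whole code points instead of UTF-16 code units, and it replaces u_getCombiningClass() with a deliberately simplified check: the hypothetical sketch_is_combining() only recognises the Combining Diacritical Marks block (U+0300..U+036F). Real code should rely on ICU for this.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for "u_getCombiningClass(ch) != 0": only the
 * Combining Diacritical Marks block is recognised. */
static int sketch_is_combining(uint32_t cp)
{
    return cp >= 0x0300 && cp <= 0x036F;
}

/* Reverse len code points from src into dst, keeping each base character
 * together with the combining marks that follow it. */
static void sketch_reverse(const uint32_t *src, size_t len, uint32_t *dst)
{
    size_t out = 0;
    size_t end = len; /* one past the last unprocessed code point */

    while (end > 0) {
        size_t start = end - 1;

        /* Walk back over combining marks to the base character. */
        while (start > 0 && sketch_is_combining(src[start])) {
            start--;
        }
        /* Copy the sequence [start, end) in its original order. */
        for (size_t k = start; k < end; k++) {
            dst[out++] = src[k];
        }
        end = start;
    }
}
```

Applied to the five code points from the illustration above (a, o, U+0301, U+0320, l), this produces l, o, U+0301, U+0320, a, so the marks stay attached to 'o'.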
707
708
709realpath()
710----------
711
712Filenames use their own converter as it's not uncommon, for example,
713to need to access files on a filesystem with latin1 entries while outputting
714UTF8 runtime content.
715
716The most common approach to parsing filenames can be found in realpath():
717
zval **ppfilename;
char *filename;
int filename_len;

if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "Z", &ppfilename) == FAILURE ||
    php_stream_path_param_encode(ppfilename, &filename, &filename_len, REPORT_ERRORS, FG(default_context)) == FAILURE) {
    return;
}
726
727Here, the filename is taken first as a generic zval**, then converted (separating if necessary)
728and populated into local char* and int storage. The filename will be converted according to
729unicode.filesystem_encoding unless the wrapper specified overrides this with its own conversion
730function (The http:// wrapper, for example, enforces utf8 conversion).
731
732
733rmdir()
734-------
735
If the function accepts a context parameter, then this context should be used in place of FG(default_context).
737
zval **ppdir, *zcontext = NULL;
php_stream_context *context;
char *dir;
int dir_len;

if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "Z|r", &ppdir, &zcontext) == FAILURE) {
    return;
}

context = php_stream_context_from_zval(zcontext, 0);
if (php_stream_path_param_encode(ppdir, &dir, &dir_len, REPORT_ERRORS, context) == FAILURE) {
    return;
}
750
751
752sqlite_query()
753--------------
754
If the function's underlying library expects a particular encoding (e.g. UTF8), then the alternate form of
the string parameter may be used with zend_parse_parameters().
757
758char *sql;
759int sql_len;
760
761if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s&", &sql, &sql_len, UG(utf8_conv)) == FAILURE) {
762 return;
763}
764
765Converters
766==========
767
768Standard Converters
769-------------------
770
771The following converters (UConverter*) are initialized by Zend and are always available (regardless of UG(unicode) mode):
772 UG(utf8_conv)
773 UG(ascii_conv)
774 UG(fallback_encoding_conv) - UTF8 unless overridden by INI setting unicode.fallback_encoding
775
776Additional converters will be optionally initialized depending on INI settings:
777 UG(runtime_encoding_conv) - unicode.runtime_encoding
    . Unicode output generated by a script will be encoded using this converter
779
780 UG(script_encoding_conv) - unicode.script_encoding
781 . Scripts read from disk will be decoded using this converter
782
783 UG(filesystem_encoding_conv) - unicode.filesystem_encoding
    . Filenames and paths will be encoded using this converter
785
786
Since these additional converters may not be instantiated (because their INI value is not set), all uses of these converters must
be wrapped in ZEND_U_CONVERTER() for safety. If the converter hasn't been instantiated, then UG(fallback_encoding_conv) will be
used instead.
790
791For example, RETURN_RT_STRING("foo", ZSTR_DUPLICATE); expands out to:
792 RETURN_U_STRING(ZEND_U_CONVERTER(UG(runtime_encoding_conv)), "foo", ZSTR_DUPLICATE);
793
This uses UG(runtime_encoding_conv) if it has been set, and UG(fallback_encoding_conv) otherwise.
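
The fallback pattern itself is simple and can be sketched in standalone C. The type sketch_converter and the function sketch_pick_converter() are hypothetical and only model the selection logic; the real code works with ICU's UConverter* and the ZEND_U_CONVERTER() macro.

```c
#include <stddef.h>

/* Hypothetical converter handle; the real code uses ICU's UConverter*. */
typedef struct sketch_converter {
    const char *name;
} sketch_converter;

/* Return conv if it was instantiated, otherwise the fallback converter. */
static const sketch_converter *sketch_pick_converter(
    const sketch_converter *conv, const sketch_converter *fallback)
{
    return conv != NULL ? conv : fallback;
}
```

Because the fallback converter is always initialized, the caller never has to handle a missing converter itself.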
795
Note that the INI setting unicode.stream_encoding does not instantiate a UConverter* automatically for use by the process/thread;
instead, it stores the value as a string for use during fopen() style calls where a UConverter* is instantiated for that particular stream.
798
799References
800==========
801
802[1] http://icu.sourceforge.net/apiref/icu4c/ustring_8h.html#a1
803
804[2] http://icu.sourceforge.net/apiref/icu4c/utf16_8h.html
805
806vim: set et ai tw=76 fo=tron21:
807
README.UNIX-BUILD-SYSTEM
1PHP Build System V5 Overview
2
3- supports Makefile.ins during transition phase
4- not-really-portable Makefile includes have been eliminated
5- supports separate build directories without VPATH by using
6 explicit rules only
7- does not waste disk-space/CPU-time for building temporary libraries
8 => especially noticeable on slower systems
9- slow recursive make replaced with one global Makefile
10- eases integration of proper dependencies
11- adds PHP_DEFINE(what[, value]) which creates a single include-file
12 per what. This will allow more fine-grained dependencies.
13- abandoning the "one library per directory" concept
14- improved integration of the CLI
15- several new targets
16 build-modules: builds and copies dynamic modules into modules/
17 install-cli: installs the CLI only, so that the install-sapi
18 target does only what its name says
19- finally abandoned automake (still requires aclocal at this time)
20- changed some configure-time constructs to run at buildconf-time
21- upgraded shtool to 1.5.4
22- removed $(moduledir) (use EXTENSION_DIR)
23
24The Reason For a New System
25
26It became more and more apparent that there is a severe need
27for addressing the portability concerns and improving the chance
28that your build is correct (how often have you been told to
29"make clean"? When this is done, you won't need to anymore).
30
31
32If You Build PHP on a Unix System
33
34
35You, as a user of PHP, will notice no changes. Of course, the build
36system will be faster, look better and work smarter.
37
38
39
40If You Are Developing PHP
41
42
43
44
45Extension developers:
46
47Makefile.ins are abandoned. The files which are to be compiled
48are specified in the config.m4 now using the following macro:
49
50PHP_NEW_EXTENSION(foo, foo.c bar.c baz.cpp, $ext_shared)
51
52E.g. this enables the extension foo which consists of three source-code
53modules, two in C and one in C++. And, depending on the user's wishes,
54the extension will even be built as a dynamic module.
55
56The full syntax:
57
58PHP_NEW_EXTENSION(extname, sources [, shared [,sapi_class[, extra-cflags]]])
59
60Please have a look at acinclude.m4 for the gory details and meanings
61of the other parameters.
62
63And that's basically it for the extension side.
64
65If you previously built sub-libraries for this module, add
66the source-code files here as well. If you need to specify
67separate include directories, do it this way:
68
69PHP_NEW_EXTENSION(foo, foo.c mylib/bar.c mylib/gregor.c,,,-I@ext_srcdir@/lib)
70
71E.g. this builds the three files which are located relative to the
72extension source directory and compiles all three files with the
73special include directive (@ext_srcdir@ is automatically replaced).
74
75Now, you need to tell the build system that you want to build files
76in a directory called $ext_builddir/lib:
77
78PHP_ADD_BUILD_DIR($ext_builddir/lib)
79
80Make sure to call this after PHP_NEW_EXTENSION, because $ext_builddir
81is only set by the latter.
82
If you have a complex extension, you might need to add special
Make rules. You can do this by calling PHP_ADD_MAKEFILE_FRAGMENT
in your config.m4 after PHP_NEW_EXTENSION.
86
87This will read a file in the source-dir of your extension called
88Makefile.frag. In this file, $(builddir) and $(srcdir) will be
89replaced by the values which are correct for your extension
90and which are again determined by the PHP_NEW_EXTENSION macro.
91
92Make sure to prefix *all* relative paths correctly with either
93$(builddir) or $(srcdir). Because the build system does not
94change the working directory anymore, we must use either
95absolute paths or relative ones to the top build-directory.
96Correct prefixing ensures that.
97
98
99SAPI developers:
100
101Instead of using PHP_SAPI=foo/PHP_BUILD_XYZ, you will need to type
102
103PHP_SELECT_SAPI(name, type, sources.c)
104
105I.e. specify the source-code files as above and also pass the
106information regarding how PHP is supposed to be built (shared
107module, program, etc).
108
109For example for APXS:
110
111PHP_SELECT_SAPI(apache, shared, sapi_apache.c mod_php5.c php_apache.c)
112
113
114
115General info
116
117The foundation for the new system is the flexible handling of
118sources and their contexts. With the help of macros you
119can define special flags for each source-file, where it is
120located, in which target context it can work, etc.
121
122Have a look at the well documented macros
123PHP_ADD_SOURCES(_X) in acinclude.m4.
124
README.UPDATING_TO_PHP6
1Updating your script to PHP6
2============================
3
This document attempts to describe portions of PHP that changed or
disappeared in PHP6 and the best practices for upgrading existing
applications to support PHP6.
7
81. Language
9 1.1 Functions and function aliases
10 1.2 Register globals
11 1.3 Magic quotes
12 1.4 Register long arrays ($HTTP_*_VARS)
13 1.5 ZE1 compatibility mode
14 1.6 dl() function
15 1.7 E_ALL and E_STRICT constants
16 1.8 References
172. Unicode (see README.UNICODE-UPGRADES)
182. Extensions
192.1 GD
20
21
221.1 Functions and function aliases
23 ------------------------------
24
<TODO: List all arguments order changes, aliases dropped in php6...>

261.2 Register globals
27 ----------------
28
29For security reasons, register_globals has been removed from php6.
30ini_get('register_globals') will always return false.
31
32You can emulate its behavior with some minimum changes in your code.
33
*DISCLAIMER*
This is only a short-term solution for people who are willing to run
an insecure app.
37
38Here is an example to emulate the session related functions and
39a snippet to register variables:
40
41$_register_globals_order = strrev(ini_get("variables_order"));
42$_register_globals_order_len = strlen($_register_globals_order);
43
44for($_register_globals_i=0;$_register_globals_i<$_register_globals_order_len;$_register_globals_i++) {
45 switch($_register_globals_order{$_register_globals_i}) {
46 case "E":
47 extract($_ENV, EXTR_REFS|EXTR_SKIP);
48 break;
49
50 case "G":
51 extract($_GET, EXTR_REFS|EXTR_SKIP);
52 break;
53
54 case "P":
55 extract($_POST, EXTR_REFS|EXTR_SKIP);
56 break;
57
58 case "C":
59 extract($_COOKIE, EXTR_REFS|EXTR_SKIP);
60 break;
61
62 case "S":
63 extract($_SERVER, EXTR_REFS|EXTR_SKIP);
64 break;
65 }
66}
67unset($_register_globals_order, $_register_globals_order_len, $_register_globals_i);
68
69function session_register($mixed) {
70 static $started;
71 if(!isset($started) || session_id() === "") {
72 session_start();
73 $started = true;
74 }
75
76 $array = func_get_args();
77 foreach($array as $mixed) {
78
79 if(is_scalar($mixed)) {
80 $_SESSION[$mixed] =& $GLOBALS[$mixed];
81 }
82 elseif(is_array($mixed)) {
83 foreach($mixed as $name) {
84 $ok = session_register($name);
85 if(!$ok) {
86 return false;
87 }
88 }
89 }
90 else {
91 return false;
92 }
93 }
94 return true;
95}
96
97function session_is_registered($name) {
98 if(is_scalar($name)) {
99 return isset($_SESSION[$name]);
100 }
101 return false;
102}
103
104function session_unregister($name) {
105 if(isset($_SESSION[$name]) && is_scalar($name)) {
106 unset($_SESSION[$name]);
107 return true;
108 }
109 return false;
110}
111
1121.3 Magic quotes
113 ------------
114
1151.4 Register long arrays ($HTTP_*_VARS)
116 -----------------------------------
117
register_long_arrays and the long versions of super globals have been removed.
PHP will emit E_CORE_ERROR during startup if it detects the
register_long_arrays setting.
121
122You can emulate long arrays by including the following file:
123
124<?php
125if (!ini_get('register_long_arrays')) {
126 $HTTP_POST_VARS =& $_POST;
127 $HTTP_GET_VARS =& $_GET;
128 $HTTP_COOKIE_VARS =& $_COOKIE;
129 $HTTP_SERVER_VARS =& $_SERVER;
130 $HTTP_ENV_VARS =& $_ENV;
131 $HTTP_POST_FILES =& $_FILES;
132}
133?>
134
1351.5 ZE1 compatibility mode
136 ----------------------
137
ZE1 compatibility mode (PHP4 object model) was introduced to make migration
from PHP4 to PHP5 easier, but it never kept 100% compatibility.
It is completely removed in PHP6, and there is no way to emulate it.
Applications should assume the PHP5/PHP6 object model.
142
1431.6 dl() function
144 -------------
145
The dl() function is now supported only in the CLI, CGI and EMBED SAPIs.
There is no way to emulate it. You can only check whether dl() is supported by the SAPI:
148
<?php
if (!function_exists("dl")) {
	die("dl() function is required!\n");
}
?>
154
1551.7 E_ALL and E_STRICT constants
156 ----------------------------
157
158Now E_ALL error reporting mask includes E_STRICT.
159You can filter E_STRICT error messages using the following code:
160
161<?php
162error_reporting(error_reporting() & ~E_STRICT);
163?>
164
1651.8 References
166 ----------
167
168<TODO: Derick plans to clean the reference mess in php6>
169
1702.1 GD
171
<TODO: gd2/ft2 only, functions dropped>
173
README.WIN32-BUILD-SYSTEM
README.Zeus
1Using PHP 5 with the Zeus Web Server
2-----------------------------------
3
4Zeus fully supports running PHP in combination with our
5webserver. There are three different interfaces that can be used to
6enable PHP:
7
8* CGI
9* ISAPI
10* FastCGI
11
12Of the three, we recommend using FastCGI, which has been tested and
13benchmarked as providing the best performance and reliability.
14
15Full details of how to install PHP are available from our
16website, at:
17
18http://support.zeus.com/products/php.html
19
20If you have any problems, please check the support site for more
21up-to-date information and advice.
22
23
24Quick guide to installing CGI/FastCGI with Zeus
25-----------------------------------------------
26
27Step 1 - Compile PHP as FastCGI.
28
29Compile as follows:
30 ./configure --enable-fastcgi
31 make
32
33Note that PHP has many options to the configure script -
34e.g. --with-mysql. You will probably want to select your usual options
35before compiling; the above is just a bare minimum, for illustration.
36
37After compilation finishes, you will be left with an executable
38program called 'php'. Copy this into your document root, under a
39dedicated FastCGI directory (e.g. $DOCROOT/fcgi-bin/php)
40
41
42Step 2 - configure Zeus
43
44Four stages:
45 - enable FastCGI
46 - configure FastCGI
47 - setup alias for FastCGI
48 - setup alias for PHP
49
501) Using the admin server, go to the 'module configuration' page for
51your virtual server, and ensure that 'fastcgi' is enabled (select the
52tickbox to the left).
53
542) While we can run FastCGI's locally, there are known problems with
55some OS's (specifically, the communication between web server and
56FastCGI happens over a unix domain socket, and some OS's have trouble
57sustaining high connection rates over these sockets). So instead, we
58are going to set up the PHP FastCGI to run 'remotely' over localhost
59(this uses TCP sockets, which do not suffer this problem). Go to the
60'fastcgi configuration' page, and under 'add remote fastcgi':
61 Add Remote FastCGI
62 Docroot path /fcgi-bin/php
63 Remote machine localhost:8002
64The first entry is where you saved PHP, above.
65The second entry is localhost:<any unused port>
66We will start the FastCGI listening on this port shortly.
67Click 'update' to commit these changes.
68
693) Go to the path mapping module and add an alias for FastCGI:
70 Add Alias
71 Docroot path /fcgi-bin
72 Filesystem directory /path/to/docroot/fcgi-bin
73 Alias type fastcgi
74Click 'update' to commit these changes
75
764) Also on the path mapping module, add a handler for PHP:
77 Add handler
78 File extension php
79 Handler /fcgi-bin/php
80Click 'update' to commit these changes
81
82Finally restart your virtual server for these changes to take effect.
83
84
85Step 3 - start PHP as a FastCGI runner
86
87When you start PHP, it will pre-fork a given number of child processes
88to handle incoming PHP requests. Each process will handle a given
89number of requests before exiting (and being replaced by a newly
90forked process). You can control these two parameters by setting the
91following environment variables BEFORE starting the FastCGI runner:
92
PHP_FCGI_CHILDREN - the number of child processes to pre-fork. This
variable MUST be set; if it is not, PHP will not run as a FastCGI.
We recommend a value of 8 for a fairly busy site. If you have many
long-running PHP scripts, then you may need to increase this further.
97
98PHP_FCGI_MAX_REQUESTS - the number of requests each PHP child process
99handles before exiting. If not set, defaults to 500.
100
101To start the FastCGI runner, execute '$ZEUSHOME/web/bin/fcgirunner
1028002 $DOCROOT/fcgi-bin/php'. Substitute the appropriate values for
103$ZEUSHOME and $DOCROOT; also substitute for 8002 the port you chose,
104above.
105
To stop the runner (e.g. to experiment with the above environment
variables) you will need to manually stop any running PHP
processes (use 'ps' and 'kill'). As it is PHP which is forking lots
of children and not the runner, Zeus unfortunately cannot keep track
of what processes are running. A typical command line may look like:

  ps -efl | grep $DOCROOT/fcgi-bin/php | grep -v grep | awk '{print $4}' | xargs kill
113