Nokogiri v1.15.0 has been released!
This is primarily a feature release, with some small bugfixes.
Full release notes are at
https://github.com/sparklemotion/nokogiri/releases/tag/v1.15.0 but they are
included below for your convenience.
---
Nokogiri (鋸) makes it easy and painless to work with XML and HTML from
Ruby. It provides a sensible, easy-to-understand API for reading, writing,
modifying, and querying documents. It is fast and standards-compliant by
relying on native parsers like libxml2, libgumbo, or xerces.
---
## 1.15.0 / 2023-05-15
### Notes
#### Ability to opt into system `malloc` and `free`
Since 2009, Nokogiri has configured libxml2 to use `ruby_xmalloc` et al for
memory management. This has provided benefits for memory management, but
comes with a performance penalty.
Users can now opt into using system `malloc` for libxml2 memory management
by setting an environment variable:
``` sh
# "default" here means "libxml2's default" which is system malloc
NOKOGIRI_LIBXML_MEMORY_MANAGEMENT=default
```
Benchmarks show that this setting will significantly improve performance,
but be aware that the tradeoff may involve poorer memory management
including bloated heap sizes and/or OOM conditions.
You can read more about this in the decision record at
[`adr/2023-04-libxml-memory-management.md`](adr/2023-04-libxml-memory-management.md).
### Dependencies
* [CRuby] Vendored libxml2 is updated to v2.11.3 from v2.10.4. For details
  please see:
  * https://gitlab.gnome.org/GNOME/libxml2/-/releases/v2.11.0
  * https://gitlab.gnome.org/GNOME/libxml2/-/releases/v2.11.1
  * https://gitlab.gnome.org/GNOME/libxml2/-/releases/v2.11.2
  * https://gitlab.gnome.org/GNOME/libxml2/-/releases/v2.11.3
* [CRuby] Vendored libxslt is updated to v1.1.38 from v1.1.37. For details
  please see:
  * https://gitlab.gnome.org/GNOME/libxslt/-/releases/v1.1.38
### Added
* `Encoding` objects may now be passed to serialization methods like
`#to_xml`, `#to_html`, `#serialize`, and `#write_to` to specify the output
encoding. Previously only encoding names (strings) were accepted. [[#2774](
https://github.com/sparklemotion/nokogiri/issues/2774), [#2798](
https://github.com/sparklemotion/nokogiri/issues/2798)] (Thanks,
[@ellaklara](https://github.com/ellaklara)!)
* [CRuby] Users may opt into using system `malloc` for libxml2 memory
management. For more detail, see note above or
[`adr/2023-04-libxml-memory-management.md`](adr/2023-04-libxml-memory-management.md).
### Changed
* [CRuby] `Schema.from_document` now makes a defensive copy of the document
if it has blank text nodes with Ruby objects instantiated for them. This
prevents unsafe behavior in libxml2 from causing a segfault. There is a
small performance cost, but we think this has the virtue of being "what the
user meant" since modifying the original is surprising behavior for most
users. Previously this was addressed in v1.10.9 by raising an exception.
### Fixed
* [CRuby] `XSLT.transform` now makes a defensive copy of the document if it
has blank text nodes with Ruby objects instantiated for them _and_ the
template uses `xsl:strip-space`. This prevents unsafe behavior in libxslt
from causing a segfault. There is a small performance cost, but we think
this has the virtue of being "what the user meant" since modifying the
original is surprising behavior for most users. Previously this would allow
unsafe memory access and potentially segfault. [[#2800](
https://github.com/sparklemotion/nokogiri/issues/2800)]
### Improved
* `Nokogiri::XML::Node::SaveOptions#inspect` now shows the names of the
options set in the bitmask, similar to `ParseOptions`. [[#2767](
https://github.com/sparklemotion/nokogiri/issues/2767)]
* `#inspect` and pretty-printing are improved for `AttributeDecl`,
`ElementContent`, `ElementDecl`, and `EntityDecl`.
* [CRuby] The C extension now uses Ruby's [TypedData API](
https://docs.ruby-lang.org/en/3.0/extension_rdoc.html#label-Encapsulate+C+D…)
for managing all the libxml2 structs. Write barriers may improve GC
performance in some extreme cases. [[#2808](
https://github.com/sparklemotion/nokogiri/issues/2808)] (Thanks,
[@etiennebarrie](https://github.com/etiennebarrie) and [@byroot](
https://github.com/byroot)!)
* [CRuby] `ObjectSpace.memsize_of` reports a pretty good guess of memory
usage when called on `Nokogiri::XML::Document` objects. [[#2807](
https://github.com/sparklemotion/nokogiri/issues/2807)] (Thanks,
[@etiennebarrie](https://github.com/etiennebarrie) and [@byroot](
https://github.com/byroot)!)
* [CRuby] Users installing the "ruby" platform gem and compiling libxml2
and libxslt from source will now be using a modern `config.guess` and
`config.sub` that supports new architectures like `loongarch64`. [[#2831](
https://github.com/sparklemotion/nokogiri/issues/2831)] (Thanks,
[@zhangwenlong8911](https://github.com/zhangwenlong8911)!)
* [CRuby] HTML5 parser:
  * adjusts the specified attributes, adding `xlink:arcrole` and removing
    `xml:base` [[#2841](https://github.com/sparklemotion/nokogiri/issues/2841),
    [#2842](https://github.com/sparklemotion/nokogiri/issues/2842)]
  * allows `<hr>` in `<select>` [[whatwg/html#3410](
    https://github.com/whatwg/html/issues/3410), [whatwg/html#9124](
    https://github.com/whatwg/html/pull/9124)]
* [JRuby] `Node#first_element_child` now returns `nil` if there are only
non-element children. Previously a null pointer exception was raised.
[[#2808](https://github.com/sparklemotion/nokogiri/issues/2808), [#2844](
https://github.com/sparklemotion/nokogiri/issues/2844)]
* Documentation for `Nokogiri::XSLT` now has usage examples including
custom function handlers.
### Deprecated
* Passing a `Nokogiri::XML::Node` as the first parameter to `CDATA.new` is
deprecated and will generate a warning. This parameter should be a kind of
`Nokogiri::XML::Document`. This will become an error in a future version of
Nokogiri.
* Passing a `Nokogiri::XML::Node` as the first parameter to
`Schema.from_document` is deprecated and will generate a warning. This
parameter should be a kind of `Nokogiri::XML::Document`. This will become
an error in a future version of Nokogiri.
* Passing a `Nokogiri::XML::Node` as the second parameter to `Text.new` is
deprecated and will generate a warning. This parameter should be a kind of
`Nokogiri::XML::Document`. This will become an error in a future version of
Nokogiri.
* [CRuby] Calling a custom XPath function without the `nokogiri` namespace
is deprecated and will generate a warning. Support for non-namespaced
functions will be removed in a future version of Nokogiri. (Note that JRuby
has never supported non-namespaced custom XPath functions.)
### Thank you!
The following people and organizations were kind enough to sponsor
@flavorjones or the Nokogiri project during the development of v1.15.0:
* Götz Görisch (@GoetzGoerisch)
* Airbnb (@airbnb)
* Kyohei Nanba (@kyo-nanba)
* Maxime Gauthier (@biximilien)
* @renuo
* @dbootyfvrt
* YOSHIDA Katsuhiko (@kyoshidajp)
* Homebrew (@Homebrew)
* Hiroshi SHIBATA (@hsbt)
* PuLLi (@the-pulli)
* SiteLog GmbH (@sitelog-gmbh)
* @zzak
* Evil Martians (@evilmartians)
* Ajaya Agrawalla (@ajaya)
* Modern Treasury (@Modern-Treasury)
* Danilo Lessa Bernardineli (@danlessa)
We'd also like to thank @github who donate a ton of compute time for our CI
pipelines!
---
sha256 checksums:
```
7dbb717c6abc6b99baa4a4e1586a6de5332513f72a8b3568a69836268c2e1f86  nokogiri-1.15.0-aarch64-linux.gem
a60c373d86a9a181f9ace78793c4a91ab8fa971af3cce93e9fdf022cd808fe41  nokogiri-1.15.0-arm-linux.gem
41d312b2d4aa6b6750c2431a25c1bf25fb567bc1e0a750cf55dd02354967724b  nokogiri-1.15.0-arm64-darwin.gem
51cc8d4d98473d00c0ee18266d146677161b6dd16f8c89cc637db91d47b87c63  nokogiri-1.15.0-java.gem
1b2d92e240d12ac0a27cb0618f52af6c405831fd339a45aaab265cecda1dc6ab  nokogiri-1.15.0-x64-mingw-ucrt.gem
497840b3ed9037095fbdd1bf6f7c63d23efab5bcbb03b89d94a6ac8bcab3eda5  nokogiri-1.15.0-x64-mingw32.gem
5c26427f3cf28d8c1e43f7a7bc58e50298461c7bed5179456b122eefc2b2c5cb  nokogiri-1.15.0-x86-linux.gem
cbf93df1c257693dfe804c01252415ca7cb9d2452d6cebddf7a35a5dbeb3ea12  nokogiri-1.15.0-x86-mingw32.gem
ca6cd6ed08e736063539c4aa7454391dfa4153908342e3d873f5bd9218d6f644  nokogiri-1.15.0-x86_64-darwin.gem
4b28e9151e884c10794e0acf4a6f49db933eee3cd90b20aab952ee0102a03b0c  nokogiri-1.15.0-x86_64-linux.gem
0ca8ea2149bdaaae8db39f11971af86c83923ec58b72c519d498ec44e1dfe97f  nokogiri-1.15.0.gem
```
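To check a downloaded gem against one of the checksums above, something along these lines should work. This is only a minimal sketch using Ruby's standard `Digest::SHA256`; the helper name `sha256_matches?` and the local file path in the usage comment are assumptions for illustration and are not part of Nokogiri.

```ruby
require 'digest'

# Hypothetical helper: returns true when the SHA-256 digest of the file at
# `path` equals the expected hex string (compared case-insensitively).
def sha256_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex.strip.downcase
end

# Example usage, assuming the gem was downloaded to the current directory:
#   sha256_matches?(
#     'nokogiri-1.15.0.gem',
#     '0ca8ea2149bdaaae8db39f11971af86c83923ec58b72c519d498ec44e1dfe97f'
#   )
```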
I have been investigating a test failure triggered by upgrading selenium-webdriver from v4.9.0 to v4.9.1. This has led me to the following test code, which demonstrates a problem that arises from an interaction between capybara and selenium. The test produces the following error:
> wrong number of arguments (given 2, expected 0..1) (ArgumentError)
Here is some code that reproduces the error
-----8<-----8<-----8<-----8<-----8<-----8<------8<-----8<-----8<-----
require 'selenium/webdriver/atoms'
require 'selenium/webdriver/common'
require 'selenium/webdriver/version'

module DeprecationSuppressor
  def initialize(*)
    puts 'in DeprecationSuppressor::initialize'
    super
  end
end

Selenium::WebDriver::Logger.prepend DeprecationSuppressor

class Demo
  def initialize
    @logger = Selenium::WebDriver::Logger.new('Selenium', ignored: true)
    puts @logger.class
  end
end

Demo.new
-----8<-----8<-----8<-----8<-----8<-----8<------8<-----8<-----8<-----
Here it is failing:
➜ ruby -v
ruby 3.2.2 (2023-03-30 revision e51014f9c0) [arm64-darwin22]
➜ ruby demo.rb
in DeprecationSuppressor::initialize
/Users/my/.rvm/gems/ruby-3.2.2@test/gems/selenium-webdriver-4.9.1/lib/selenium/webdriver/common/logger.rb:51:in `initialize': wrong number of arguments (given 2, expected 0..1) (ArgumentError)
The error is raised by the `super` call in the initialize method of the DeprecationSuppressor module.
Naming the arguments in the initialize method fixes the problem, e.g.:
-----8<-----8<-----8<-----8<-----8<-----8<------8<-----8<-----8<-----
module DeprecationSuppressor
  def initialize(*args)
    puts 'in DeprecationSuppressor::initialize'
    super args
  end
end

Selenium::WebDriver::Logger.prepend DeprecationSuppressor
-----8<-----8<-----8<-----8<-----8<-----8<------8<-----8<-----8<-----
➜ ruby demo.rb
in DeprecationSuppressor::initialize
Selenium::WebDriver::Logger
I have also found that using `initialize(...)` works.
I've tested this on Ruby 3.2.2 and 3.1.4.
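A likely explanation, offered with some hedging: on Ruby 3 a method declared as `initialize(*)` accepts no keywords, so `ignored: true` is packed into a trailing positional Hash, and bare `super` then forwards that Hash as a second positional argument. The sketch below tries to reproduce this without selenium or capybara. `SketchLogger` and the three wrapper modules are hypothetical stand-ins, not selenium-webdriver code. It also suggests that the `super args` variant avoids the error only because the whole array is passed as one argument, which drops the keyword.

```ruby
# Hypothetical stand-in for a class whose constructor takes one optional
# positional argument plus a keyword argument.
class SketchLogger
  attr_reader :progname, :ignored

  def initialize(progname = 'Sketch', ignored: nil)
    @progname = progname
    @ignored = ignored
  end
end

# Anonymous splat, bare super: the keywords were collected into a positional
# Hash, so the parent receives two positional arguments.
module SplatOnly
  def initialize(*)
    super
  end
end

# Named splat, `super args`: the parent receives ONE argument, the Array.
module ArrayAsArgument
  def initialize(*args)
    super args
  end
end

# Splat plus double splat, bare super: keywords are forwarded as keywords.
module SplatAndKeywords
  def initialize(*args, **kwargs)
    super
  end
end

class SplatOnlyLogger < SketchLogger
  prepend SplatOnly
end

class ArrayLogger < SketchLogger
  prepend ArrayAsArgument
end

class KeywordLogger < SketchLogger
  prepend SplatAndKeywords
end

begin
  SplatOnlyLogger.new('Selenium', ignored: true)
rescue ArgumentError => e
  puts "SplatOnly: #{e.message}"
end

array_logger = ArrayLogger.new('Selenium', ignored: true)
puts "ArrayAsArgument: progname=#{array_logger.progname.inspect}, " \
     "ignored=#{array_logger.ignored.inspect}"

keyword_logger = KeywordLogger.new('Selenium', ignored: true)
puts "SplatAndKeywords: progname=#{keyword_logger.progname.inspect}, " \
     "ignored=#{keyword_logger.ignored.inspect}"
```

If that reading is right, declaring both `*args` and `**kwargs` (or using `...`) keeps the keyword arguments intact, whereas `super args` silently changes what the parent constructor sees.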
For further context of the wider problem, here is the initialize method in selenium-webdriver 4.9.0 and 4.9.1
4.9.0: https://github.com/SeleniumHQ/selenium/blob/selenium-4.9.0/rb/lib/selenium/…
4.9.1: https://github.com/SeleniumHQ/selenium/blob/selenium-4.9.1/rb/lib/selenium/…
The difference is that two more named arguments have been added.
Finally, it looks to me as though the Capybara team is preparing to fix this in v3.39.1 (https://github.com/teamcapybara/capybara/issues/2666).
My question is whether Ruby is behaving correctly. It seems to me that the `super` call in the `initialize(*)` method should work.
If there's interest in such posts, I will write more. The idea is
that I will provide examples of lesser-known features of Ruby. Later
on, I will do a longer write-up where I provide a solution and
explain it, unless someone does it before me :)
Let's say I have a program I can't modify:
```ruby
def gen_proc
  secret1 = rand
  secret2 = rand
  proc do
    p secret1 * secret2
    nil
  end
end

my_proc = gen_proc
binding.irb
```
It will launch an interactive Ruby console. When I type:
```ruby
my_proc.() # .() is a shortcut of .call()
```
It will print some number, and each time I type this expression it
prints the same number (until I relaunch the console). This number
is derived from the `secret1` and `secret2` variables. Those variables are
stored somewhere, but where? What is the best way for me to obtain the
values of those variables? Do note that I can't modify the code :)
Hi all,
Earlier today, someone was added to a project I worked on long ago, and
Google emailed me after delivery saying the noreply email from rubygems had
been flagged for malware. Has anyone else seen anything similar? Am I a
one-off or is there something wrong?
For those of you (still) interested in building desktop GUI applications with Ruby, this might be interesting.
A first public beta of the newly revived wxRuby3 has been released. Like former editions of this project, wxRuby3 aims to provide a native extension wrapping the popular C++ wxWidgets GUI framework libraries.
If you are interested, check this out:
Github: https://github.com/mcorino/wxruby3
Rubygems: https://rubygems.org/gems/wxruby3
Hi everyone at ruby-talk,
I want to bring forward the art of writing scripts.
Did you ever write a Unix script?
Ruby was invented to write scripts.
We are still stuck with that `%x`.
zenweb version 3.10.6 has been released!
* home: <https://github.com/seattlerb/zenweb>
* bugs: <https://github.com/seattlerb/zenweb/issues>
* rdoc: <http://docs.seattlerb.org/zenweb>
Zenweb is a set of classes/tools for organizing and formatting a
website. It is website oriented rather than webpage oriented, unlike
most rendering tools. It is content oriented, rather than style
oriented, unlike most rendering tools. It uses a rubygems plugin
system to provide a very flexible and powerful system.
Zenweb 3 was inspired by jekyll. The filesystem layout is similar to
jekyll's layout, but zenweb isn't focused on blogs. It can do any sort
of website just fine.
Zenweb uses rake to handle dependencies. As a result, scanning a
website and regenerating incrementally is not just possible, it is
blazingly fast.
Changes:
### 3.10.6 / 2023-05-04
* 1 bug fix:
  * Remove page from parent subpages if being moved via fix_subpages.
The video for the Montreal.rb (Ruby/Rails Meetup) May 2023 talk
"Integrating REST APIs with Microsoft Kiota" has been posted! Microsoft
Kiota is an open-source technology that can automatically generate SDKs for
HTTP REST APIs in Ruby (or any programming language) to save software
engineers from having to write error prone API client code that handles
authentication, authorization, serialization, and exception handling
manually.
https://andymaleh.blogspot.com/2023/05/montrealrb-may-3-2023-integrating-re…
Direct YouTube video link
https://youtu.be/bCox68tBVMw
netsnmp 0.7.0 has been released.
https://github.com/HoneyryderChuck/ruby-netsnmp
<https://github.com/swisscom/ruby-netsnmp>
This gem provides:
- Implementation in Ruby of the SNMP protocol for v3, v2c and v1 (most
notably RFC 3414 and RFC 3826).
- SNMPv3 USM supporting MD5/SHA/SHA224/SHA384/SHA256/SHA512 auth and
DES/AES128/AES192/AES256 privacy crypto algorithms.
- Client/Manager API with a simple interface for get, getnext, set and walk.
- Support for concurrency and evented I/O.
- Ruby >= 2.1 support (modern)
- Pure Ruby (no FFI)
- Easy PDU debugging
Here are the updates since the last release:
### 0.7.0
#### Features
Added support for SHA224, SHA384, SHA512, AES192, AES256 auth/priv
protocols.
#### Bugfixes
* MIB parser supports `OBJECT-IDENTITY` now.
### 0.6.4
Making the octet string in msgAuthenticationParameters 0-length when no
authentication is to happen (some SNMP implementations are quite strict on
this point).
### 0.6.3
* The `OidNotFound` exception is now raised when a response PDU is received
with an empty value.
### 0.6.2
#### Improvements
* extended use of RBS signatures (thx to `openssl` RBS signatures).
#### Bugfixes
* fixed mib loading when there are no imports.
* fixed default mibs by including subdirs of `/usr/share/snmp/mibs` (in
debian, snmp mibs get loaded under `/iana` and `/ietf`)
### 0.6.1
#### Bugfixes
Removed the `MSG_NOSIGNAL` flag from UDP socket send calls, given that it's
unnecessary for UDP transactions and isn't defined in all environments,
such as Mac OS.
sexp_processor version 4.17.0 has been released!
* home: <https://github.com/seattlerb/sexp_processor>
* rdoc: <http://docs.seattlerb.org/sexp_processor>
sexp_processor branches from ParseTree bringing all the generic sexp
processing tools with it. Sexp, SexpProcessor, Environment, etc... all
for your language processing pleasure.
Changes:
### 4.17.0 / 2023-05-03
* 2 minor enhancements:
  * Added Sexp#line_max=.
  * Will load strict_sexp if $SP_DEBUG is set.
* 3 bug fixes:
  * Sexp#line_max lazy accessor now compacts.
  * Sexp#new copies line_max if defined.
  * strict_sexp.rb: #first can take an int arg. Fixed mutator wrappers to pass args.