New subject: [ruby-core:114046] [Ruby master Feature#19744] Namespace on read

27 Jun 2023

Issue #19744 has been reported by tagomoris (Satoshi TAGOMORI).

----------------------------------------
Feature #19744: Namespace on read
https://bugs.ruby-lang.org/issues/19744

* Author: tagomoris (Satoshi TAGOMORI)
* Status: Open
* Priority: Normal
----------------------------------------
# What is the "Namespace on read"

This proposes a new feature to define virtual top-level namespaces in Ruby. Those
namespaces can require/load libraries (either .rb or native extension) separately from the
global namespace. Dependencies of required/loaded libraries are also required/loaded in
the namespace.

### Motivation

The "namespace on read" can solve the two problems below, and can open a path to
solving a third. The details of these motivations are described in the "Motivation
details" section below.

#### Avoiding name conflicts between libraries

Applications can safely require two different libraries that use the same module name.

#### Avoiding unexpected globally shared modules/objects

Applications can make an independent/unshared module instance.

#### (In the future) Multiple versions of gems can be required

Application developers will have fewer version conflicts between gem dependencies if
rubygems/bundler supports the namespace on read.

### Example code with this feature

```ruby
# your_module.rb
module YourModule
end

# my_module.rb
require 'your_module'

module MyModule
end

# example.rb
namespace1 = NameSpace.new
namespace1.require('my_module') #=> true

namespace1::MyModule #=> #<Module:0x00000001027ea650>::MyModule (or
#<NameSpace:0x00...>::MyModule ?)
namespace1::YourModule # similar to the above

MyModule # NameError
YourModule # NameError

namespace2 = NameSpace.new      # Any number of namespaces can be defined
namespace2.require('my_module') # Different library "instance" from namespace1

require 'my_module' # require in the global namespace

MyModule.object_id != namespace1::MyModule.object_id #=> true
namespace1::MyModule.object_id != namespace2::MyModule.object_id #=> true
```

The required/loaded libraries will define different "instances" of
modules/classes in those namespaces (just like the `wrap` 2nd argument of
`Kernel#load`). This doesn't introduce compatibility problems if all libraries use
relative name resolution (without forced top-level references like `::Name`).
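
The comparison with `Kernel#load` can be tried in current Ruby. The following is only a minimal sketch of that existing behavior, not of the proposed `NameSpace` API: the file name `wrap_demo_lib.rb`, the module `WrapDemoLib`, the global `$wrap_demo_instances`, and the helper `load_isolated_copies` are all hypothetical names made up for illustration.

```ruby
require 'tmpdir'

# Hypothetical stand-in for a library file. It reports the module it
# defined through a global variable, because constants defined under
# `load(path, true)` are not reachable from the caller afterwards.
WRAP_DEMO_SOURCE = <<~'RUBY'
  module WrapDemoLib
    SETTINGS = { mode: :default }
  end
  $wrap_demo_instances << WrapDemoLib
RUBY

# Loads the same file `count` times, each time under a fresh anonymous
# wrapper module, and returns the module objects that were defined.
def load_isolated_copies(count)
  $wrap_demo_instances = []
  Dir.mktmpdir do |dir|
    path = File.join(dir, 'wrap_demo_lib.rb')
    File.write(path, WRAP_DEMO_SOURCE)
    count.times { load(path, true) } # wrap = true
  end
  $wrap_demo_instances
end

copy_a, copy_b = load_isolated_copies(2)
puts copy_a.equal?(copy_b)                   # two separate module objects?
puts Object.const_defined?(:WrapDemoLib)     # leaked into the global namespace?
```

Unlike the proposal, this wrapping applies only to the file passed to `load` itself; a `require` inside that file still loads its target globally.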

# "On read": optional, user-driven feature

"On read" is the key point of this feature. That means:

* No changes are required in existing/new libraries (except for limited cases, described
below)
* No changes are required in applications that don't need namespaces
* Users can enable/use namespaces just for limited code in the whole library/application

Users can start using this feature step by step (if they want it) without any big jumps.

## Motivation details

This feature can solve multiple problems I have in writing/executing Ruby code. Those are
from the 3 problems I mentioned above: name conflicts, globally shared modules, and
library version conflicts between dependencies. I'll describe 4 scenarios about those
problems.

### Running multiple applications on a Ruby process

Modern computers have many CPU cores and large memory spaces. We sometimes want to have
many separate applications (either micro-service architecture or modular monolith).
Currently, running those applications requires different processes. That adds
computation costs (especially when developing those applications).

If we have isolated namespaces and can load applications in those namespaces, we'll be
able to run apps on a process, with less overhead.

(I want to run many AWS Lambda applications on a process in isolated namespaces.)

### Running tests in isolated namespaces

Tests that require external libraries need many hacks to:

* require a library multiple times
* require many different 3rd party libraries into isolated spaces (those may conflict with
each other)

Software with plugin systems (for example, Fluentd) will benefit from namespaces.

In addition to it, application tests can avoid unexpected side effects if tests are
executed in isolated namespaces.
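
As a reminder of why requiring a library multiple times needs hacks today, here is a small sketch of current `require` behavior. The file name `reload_demo_lib.rb`, the global `$reload_demo_count`, and the helper `require_and_load_demo` are hypothetical names used only for this illustration.

```ruby
require 'tmpdir'

# Writes a tiny "library" that counts how often its body is executed,
# then requires it twice and loads it once.
def require_and_load_demo
  $reload_demo_count = 0
  Dir.mktmpdir do |dir|
    path = File.join(dir, 'reload_demo_lib.rb')
    File.write(path, "$reload_demo_count += 1\n")
    first  = require(path) # executes the file
    second = require(path) # already loaded: the file is not executed again
    load(path)             # load always executes the file again
    [first, second, $reload_demo_count]
  end
end

first, second, count = require_and_load_demo
puts "first require: #{first}, second require: #{second}, executions: #{count}"
```

Because `require` tracks loaded features per process, a test that wants a fresh copy of a library has to work around that tracking; a separate namespace would give each copy its own place to be defined.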

### Safely isolated library instances

Libraries may have globally shared states. For example,
[Oj](https://github.com/ohler55/oj) has a global `Oj.default_options` object to change
the library behavior. Those options may be changed by any dependency or
application, which changes the behavior of `Oj` globally and unexpectedly.

For such libraries, we'll be able to instantiate a safe library instance in an
isolated namespace.
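
The shared-state problem can be sketched without Oj itself. `DemoSerializer` and `dependency_setup` below are hypothetical names that only imitate the pattern of process-wide default options; they are not Oj's actual code or API.

```ruby
# Hypothetical library with process-wide mutable defaults.
module DemoSerializer
  @default_options = { indent: 0 }

  class << self
    attr_reader :default_options

    # Renders key/value pairs, honoring the shared default options.
    def dump(pairs)
      pad = ' ' * default_options[:indent]
      pairs.map { |key, value| "#{pad}#{key}=#{value}" }.join("\n")
    end
  end
end

# Stands in for some dependency that tweaks the defaults for its own needs.
def dependency_setup
  DemoSerializer.default_options[:indent] = 2
end

before = DemoSerializer.dump({ a: 1 })
dependency_setup
after = DemoSerializer.dump({ a: 1 })
puts before == after # the application's output changed without it doing anything
```

With one library instance per namespace, the dependency's change would stay inside the namespace that made it.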

### Avoiding dependency hells

Modern applications use many libraries, and those libraries require many more
dependencies. Those dependencies very often cause version conflicts. In such cases,
application developers have to resolve the conflicts by updating each library, or just
wait for new releases of the conflicting libraries. Sometimes, library
maintainers don't release updated versions, and application developers can do
nothing.

If namespaces can require/load a library multiple times, it also becomes possible to
require/load different versions of a library in a process. That requires support from
rubygems, but namespaces should be a good foundation for it.

## Expected problems

### Use of top-level references

In my expectation, `::Name` should refer to the top-level `Name` in the global namespace. I
expect that `::ENV` should contain the environment variables. But it may cause
compatibility problems if library code uses `::MyLibrary` to refer to itself in its
deeply nested library code.
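
The difference between relative and forced top-level references already exists with ordinary module nesting, which can serve as a rough analogy. In this sketch `Sandbox` merely plays the role of a namespace, and `Settings`, `relative_lookup`, and `forced_top_level_lookup` are hypothetical names.

```ruby
# A constant in the global namespace.
module Settings
  NAME = 'top-level'
end

# `Sandbox` stands in for an isolated namespace with its own `Settings`.
module Sandbox
  module Settings
    NAME = 'sandboxed'
  end

  # Relative reference: resolved through the lexical scope first.
  def self.relative_lookup
    Settings::NAME
  end

  # Forced top-level reference: skips the enclosing scope entirely.
  def self.forced_top_level_lookup
    ::Settings::NAME
  end
end

puts Sandbox.relative_lookup
puts Sandbox.forced_top_level_lookup
```

A library that refers to itself as `::MyLibrary` behaves like the second method: it would look in the global namespace rather than in the namespace it was loaded into.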

### Additional memory consumption

An extension library (dynamically linked library) may be loaded multiple times (by
`dlopen` for temporarily copied dll files) to load isolated library "instances"
if different namespaces require the same extension library. That consumes additional
memory.

In my opinion, the additional memory consumption is the minimum cost of loading
extension libraries multiple times without compatibility issues.

This occurs only when programmers use namespaces. And it's only about libraries that
are used in 2 or more namespaces.

### The change of `dlopen` flag about extension libraries

To load an extension library multiple times without conflicting symbols, all extensions
should stop sharing symbols globally. Libraries referring to symbols from other extension
libraries will have to change code & dependencies.

(On the topic of extension libraries, [Naruse also wrote an
entry](https://naruse.hateblo.jp/entry/2023/05/22/193411).)

# Misc

The proof-of-concept branch is here: https://github.com/tagomoris/ruby/pull/1
It's still a work-in-progress branch, especially for extension libraries.

-- 
https://bugs.ruby-lang.org/
