class PublicSuffix::List

A {PublicSuffix::List} is a collection of one or more {PublicSuffix::Rule}.

Given a {PublicSuffix::List}, you can add or remove {PublicSuffix::Rule}, iterate over all items in the list, or search for the rule that best matches a specific domain name.

# Create a new list
list = PublicSuffix::List.new

# Push two rules to the list
list << PublicSuffix::Rule.factory("it")
list << PublicSuffix::Rule.factory("com")

# Get the size of the list
list.size
# => 2

# Search for the rule matching given domain
list.find("example.com")
# => #<PublicSuffix::Rule::Normal>
list.find("example.org")
# => nil

You can create as many {PublicSuffix::List} as you want. The {PublicSuffix::List.default} rule list is used to tokenize and validate a domain.

{PublicSuffix::List} includes the Enumerable module.
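As a minimal sketch of what Enumerable provides: a class only has to define `each`, and methods such as `map`, `reject`, `count` and `include?` are derived from it. `TinyList` below is a hypothetical stand-in, not the library's class, and it stores plain strings instead of rule objects. Note that the class documented here redefines `select` with a domain-matching meaning, so the sketch uses `reject` instead.

```ruby
# Hypothetical stand-in for PublicSuffix::List; it stores plain strings
# instead of PublicSuffix::Rule objects.
class TinyList
  include Enumerable

  def initialize
    @rules = []
  end

  # Appends a rule and returns self so that calls can be chained.
  def <<(rule)
    @rules << rule
    self
  end

  # Enumerable only requires #each; the other methods are built on top of it.
  def each(&block)
    @rules.each(&block)
  end
end

list = TinyList.new
list << "it" << "com" << "co.uk"

lengths      = list.map(&:length)                      # => [2, 3, 5]
single_label = list.reject { |rule| rule.include?(".") } # => ["it", "com"]
```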

Constants

DEFAULT_LIST_PATH

Attributes

rules[R]

Gets the array of rules.

@return [Array<PublicSuffix::Rule::*>]

Public Class Methods

clear()

Sets the default rule list to nil.

@return [self]

# File lib/public_suffix/list.rb, line 68
def self.clear
  self.default = nil
  self
end
default(**options)

Gets the default rule list.

Initializes a new {PublicSuffix::List} on first access by parsing the content of the file at DEFAULT_LIST_PATH. Subsequent calls return the cached list.

@return [PublicSuffix::List]

# File lib/public_suffix/list.rb, line 51
def self.default(**options)
  # Forward the options as keywords; a positional hash is rejected on Ruby 3.
  @default ||= parse(File.read(DEFAULT_LIST_PATH), **options)
end
default=(value)

Sets the default rule list to value.

@param [PublicSuffix::List] value

The new rule list.

@return [PublicSuffix::List]

# File lib/public_suffix/list.rb, line 61
def self.default=(value)
  @default = value
end
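
The interplay between `default`, `default=` and `clear` can be sketched as follows. This is an illustration of the lazy-caching pattern only: `SketchList` and its `loads` counter are hypothetical, and the counter stands in for reading and parsing the list file.

```ruby
# Hypothetical class illustrating the lazy default / clear pattern.
class SketchList
  class << self
    attr_writer :default
    attr_reader :loads

    # Builds the default on first access and caches it afterwards.
    def default
      @default ||= begin
        @loads = (@loads || 0) + 1 # counts how often the "file" is loaded
        new
      end
    end

    # Drops the cached default so that the next .default call rebuilds it.
    def clear
      self.default = nil
      self
    end
  end
end

first  = SketchList.default
second = SketchList.default # same cached object, no second load
SketchList.clear
third  = SketchList.default # rebuilt after clear
```

Because the cache lives in a class-level instance variable, clearing it affects every caller that relies on the default list.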
new() { |self| ... }

Initializes an empty {PublicSuffix::List}.

@yield [self] Yields on self.

@yieldparam [PublicSuffix::List] self The newly created instance.

# File lib/public_suffix/list.rb, line 126
def initialize
  @rules = []
  yield(self) if block_given?
  reindex!
end
parse(input, private_domains: true)

Parses the given input, treating its content as a Public Suffix List.

See publicsuffix.org/format/ for more details about the input format.

@param input [#each_line] The list to parse.

@param private_domains [Boolean] Whether to include the private domains section. When false, parsing stops at the private domains marker.

@return [PublicSuffix::List]

# File lib/public_suffix/list.rb, line 82
def self.parse(input, private_domains: true)
  comment_token = "//".freeze
  private_token = "===BEGIN PRIVATE DOMAINS===".freeze
  section = nil # 1 == ICANN, 2 == PRIVATE

  new do |list|
    input.each_line do |line|
      line.strip!
      case # rubocop:disable Style/EmptyCaseCondition

      # skip blank lines
      when line.empty?
        next

      # include private domains or stop scanner
      when line.include?(private_token)
        break if !private_domains
        section = 2

      # skip comments
      when line.start_with?(comment_token)
        next

      else
        list.add(Rule.factory(line, private: section == 2), reindex: false)

      end
    end
  end
end
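
The effect of `private_domains` on a small input can be sketched without the library. The helper below mirrors the scanning logic above under simplifying assumptions: `scan_rules` is a hypothetical name, and it returns `[rule_text, private?]` pairs instead of building rule objects.

```ruby
# Hypothetical helper mirroring the scanner above; it returns
# [rule_text, private?] pairs instead of PublicSuffix::Rule objects.
def scan_rules(input, private_domains: true)
  in_private = false
  rules = []
  input.each_line do |line|
    line = line.strip
    next if line.empty?

    # The marker line is itself a comment, so it must be checked first.
    if line.include?("===BEGIN PRIVATE DOMAINS===")
      break unless private_domains
      in_private = true
      next
    end

    next if line.start_with?("//")
    rules << [line, in_private]
  end
  rules
end

sample = <<~LIST
  // ===BEGIN ICANN DOMAINS===
  com
  *.ck
  !www.ck

  // ===BEGIN PRIVATE DOMAINS===
  blogspot.com
LIST

all_rules   = scan_rules(sample)                         # 4 rules, the last one private
icann_rules = scan_rules(sample, private_domains: false) # 3 rules, none private
```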

Public Instance Methods

<<(rule, reindex: true)
Alias for: add
==(other)

Checks whether two lists are equal.

Two lists are equal if the other object is an instance of {PublicSuffix::List} and contains the same PublicSuffix::Rule::* objects in the same order.

@param [PublicSuffix::List] other

The List to compare.

@return [Boolean]

# File lib/public_suffix/list.rb, line 166
def ==(other)
  return false unless other.is_a?(List)
  equal?(other) || rules == other.rules
end
Also aliased as: eql?
add(rule, reindex: true)

Adds the given object to the list and optionally refreshes the rule index.

@param [PublicSuffix::Rule::*] rule

The rule to add to the list.

@param [Boolean] reindex

Set to true to recreate the rule index
after the rule has been added to the list.

@return [self]

@see reindex!

# File lib/public_suffix/list.rb, line 190
def add(rule, reindex: true)
  @rules << rule
  reindex! if reindex
  self
end
Also aliased as: <<
clear()

Removes all elements.

@return [self]

# File lib/public_suffix/list.rb, line 214
def clear
  @rules.clear
  reindex!
  self
end
default_rule()

Gets the default rule.

@see PublicSuffix::Rule.default

@return [PublicSuffix::Rule::*]

# File lib/public_suffix/list.rb, line 281
def default_rule
  PublicSuffix::Rule.default
end
each(*args, &block)

Iterates each rule in the list.

# File lib/public_suffix/list.rb, line 173
def each(*args, &block)
  @rules.each(*args, &block)
end
empty?()

Checks whether the list is empty.

@return [Boolean]

# File lib/public_suffix/list.rb, line 207
def empty?
  @rules.empty?
end
eql?(other)
Alias for: ==
find(name, default: default_rule, **options)

Finds and returns the most appropriate rule for the domain name.

From the Public Suffix List documentation:

  • If a hostname matches more than one rule in the file, the longest matching rule (the one with the most levels) will be used.

  • An exclamation mark (!) at the start of a rule marks an exception to a previous wildcard rule. An exception rule takes priority over any other matching rule.

## Algorithm description

  1. Match domain against all rules and take note of the matching ones.

  2. If no rules match, the prevailing rule is “*”.

  3. If more than one rule matches, the prevailing rule is the one which is an exception rule.

  4. If there is no matching exception rule, the prevailing rule is the one with the most labels.

  5. If the prevailing rule is an exception rule, modify it by removing the leftmost label.

  6. The public suffix is the set of labels from the domain which directly match the labels of the prevailing rule (joined by dots).

  7. The registered domain is the public suffix plus one additional label.

@param name [String, #to_s] The domain name.

@param default [PublicSuffix::Rule::*] The default rule to return in case no rule matches.

@return [PublicSuffix::Rule::*]

# File lib/public_suffix/list.rb, line 243
def find(name, default: default_rule, **options)
  rule = select(name, **options).inject do |l, r|
    return r if r.class == Rule::Exception
    l.length > r.length ? l : r
  end
  rule || default
end
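
The selection steps of the algorithm described for this method can be sketched on plain strings. This is an illustrative sketch, not the library's implementation: `prevailing_rule` and `public_suffix` are hypothetical helpers, and rules are written as they appear in the list file (`*.ck`, `!www.ck`).

```ruby
# Hypothetical helper: picks the prevailing rule among rules that already
# matched a domain (steps 2 to 4).
def prevailing_rule(matches, default: "*")
  return default if matches.empty?

  # An exception rule takes priority over any other matching rule.
  exception = matches.find { |rule| rule.start_with?("!") }
  return exception if exception

  # Otherwise the rule with the most labels prevails.
  matches.max_by { |rule| rule.split(".").length }
end

# Hypothetical helper: derives the public suffix of a domain from the
# prevailing rule (steps 5 and 6).
def public_suffix(domain, rule)
  labels = rule.delete_prefix("!").split(".")
  labels.shift if rule.start_with?("!") # drop the leftmost label of an exception
  domain.split(".").last(labels.length).join(".")
end

prevailing_rule(["*.ck", "!www.ck"])    # => "!www.ck"
prevailing_rule(["uk", "co.uk"])        # => "co.uk"
public_suffix("www.ck", "!www.ck")      # => "ck"
public_suffix("foo.bar.ck", "*.ck")     # => "bar.ck"
```

The registered domain (step 7) is then the public suffix plus the next label to its left, for example `foo.bar.ck` for the suffix `bar.ck`.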
indexes()

Gets a copy of the naive index: a hash whose keys are the rightmost label (the TLD) of each rule, each pointing to an array of integers (the positions of the matching rules in @rules).

# File lib/public_suffix/list.rb, line 151
def indexes
  @indexes.dup
end
reindex!()

Creates a naive index for +@rules+. Just a hash that will tell us where the elements of +@rules+ are relative to its first {PublicSuffix::Rule::Base#labels} element.

For instance, if @rules[4] and @rules[5] are the only elements of the list where Rule#labels.first is 'us', then @indexes['us'] #=> [4, 5]. That way, select can avoid matching every single rule against the candidate domain.

# File lib/public_suffix/list.rb, line 140
def reindex!
  @indexes = {}
  @rules.each_with_index do |rule, index|
    tld = Domain.name_to_labels(rule.value).last
    @indexes[tld] ||= []
    @indexes[tld] << index
  end
end
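
The shape of the resulting index can be sketched on plain strings. `build_index` is a hypothetical helper that assumes rules are given as list-file strings rather than rule objects.

```ruby
# Hypothetical helper: builds the same kind of index as reindex!, but over
# plain rule strings rather than PublicSuffix::Rule objects.
def build_index(rules)
  index = {}
  rules.each_with_index do |rule, position|
    tld = rule.split(".").last # rightmost label, e.g. "uk" for "co.uk"
    (index[tld] ||= []) << position
  end
  index
end

rules = ["com", "uk", "co.uk", "*.ck", "!www.ck"]
index = build_index(rules)
# => {"com"=>[0], "uk"=>[1, 2], "ck"=>[3, 4]}
```

Positions are appended while iterating in order, so each array is sorted in ascending order.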
select(name, ignore_private: false)

Selects all the rules matching given domain.

Internally, the lookup relies heavily on `@indexes`. The input is split into labels, and only the rules that end with the input's last label are retrieved from the index. After that, a sequential scan is performed. In most cases, where the number of rules for the same label is limited, this algorithm is efficient enough.

If `ignore_private` is set to true, the algorithm skips the rules that are flagged as private domains. Note that these rules are still part of the loop. If you frequently need to access lists ignoring the private domains, you should create a list that doesn't include these domains by setting the `private_domains: false` option when calling {.parse}.

@param name [String, #to_s] The domain name.

@param ignore_private [Boolean] Whether to skip the rules flagged as private domains.

@return [Array<PublicSuffix::Rule::*>]

# File lib/public_suffix/list.rb, line 266
def select(name, ignore_private: false)
  name = name.to_s
  indices = (@indexes[Domain.name_to_labels(name).last] || [])

  finder = @rules.values_at(*indices).lazy
  finder = finder.select { |rule| rule.match?(name) }
  finder = finder.select { |rule| !rule.private } if ignore_private
  finder.to_a
end
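
The index lookup followed by a lazy sequential scan can be sketched as follows. `select_rules` is a hypothetical helper, rules are plain hashes, and the matching is a naive whole-label suffix check that ignores wildcard and exception rules.

```ruby
# Hypothetical helper: rules are { value:, private: } hashes, and index maps
# a rightmost label to positions in rules.
def select_rules(rules, index, name, ignore_private: false)
  positions = index[name.split(".").last] || []

  # Only the candidates sharing the last label are scanned, and lazily.
  candidates = rules.values_at(*positions).lazy
  candidates = candidates.select do |rule|
    name == rule[:value] || name.end_with?(".#{rule[:value]}")
  end
  candidates = candidates.reject { |rule| rule[:private] } if ignore_private
  candidates.to_a
end

rules = [
  { value: "com",          private: false },
  { value: "blogspot.com", private: true  },
  { value: "org",          private: false }
]
index = { "com" => [0, 1], "org" => [2] }

select_rules(rules, index, "foo.blogspot.com").map { |r| r[:value] }
# => ["com", "blogspot.com"]
select_rules(rules, index, "foo.blogspot.com", ignore_private: true).map { |r| r[:value] }
# => ["com"]
```

With `ignore_private: true` the private rule is still fetched and matched before being rejected, which is why a list parsed without private domains is cheaper for repeated lookups.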
size()

Gets the number of elements in the list.

@return [Integer]

# File lib/public_suffix/list.rb, line 200
def size
  @rules.size
end