Yann Neuhaus

Subscribe to Yann Neuhaus feed
dbi services technical blog
Updated: 43 min 28 sec ago

A simple Terraform script to create an AWS EC2 playground

9 hours 53 min ago

When I started to write the blog about AWS SSM I quickly realized that I need a way to bring up and destroy my AWS EC2 playground in an easy and reproducible way. There are several options for this: I could have used the AWS command line interface or AWS CloudFormation. Both work pretty well and would have been more than enough for my simple requirements. In the end I decided to use Terraform for one simple reason: With Terraform you can not only provision on AWS but also on Azure, GCP and many others. So, instead of using a tool which is limited to AWS, using a tool which is vendor independent make much more sense.

For the AWS SSM blog I had several requirements I wanted to address:

  • The playground should run in my on VPC so I will not affect any other people doing tests at the same time
  • Because I had Windows and Linux EC2 instances two security groups should get created, one allowing SSH and one allowing RDP into the EC2 machines
  • Both security groups should allow outbound connections to the internet by using an internet gateway
  • Finally two Windows, two Ubuntu, two Red Hat, two SLES and two CentOS instances should get provisioned

Using Terraform all of this is actually quite simple. The first information you’ll need to tell Terraform is the provider you want to use. In my case this is AWS and I am using my “test” AWS profile which is configured for my AWS command line interface, the AWS region I want to use is Frankfurt:

// set the provider to AWS and the AWS region to eu-central-1
provider "aws" {
  profile    = "test"
  region     = "eu-central-1"
}

Because I want to limit direct access to the EC2 instances to my own IP address for security reasons I am defining a local variable I am changing each time I am setting this up from a different network. Input variables could also be used for that but in my case a local variable is just fine:

locals {
  my_ip        = ["37.201.6.8/32"]
}

The first component that actually gets provisioned on AWS is the VPC usinf the aws_vpc resource::

// create the virtual private network
resource "aws_vpc" "dwe-vpc" {
  cidr_block = "10.0.0.0/16"
  enable_dns_hostnames = true
  enable_dns_support = true

  tags = {
    Name = "dwe-vpc"
  }
}

This defines the IP range I’ll be using, switches the dns parameters to on and gets a tag. As I want my EC2 instances to be able to connect to the internet an internet gateway gets provisioned in the next step, which is attached to the VPC that was created in the step before:

// create the internet gateway
resource "aws_internet_gateway" "dwe-igw" {
  vpc_id = "${aws_vpc.dwe-vpc.id}"

  tags = {
    Name = "dwe-igw"
  }
}

Next we need a subnet, again attached to the VPC:

// create a dedicated subnet
resource "aws_subnet" "dwe-subnet" {
  vpc_id            = "${aws_vpc.dwe-vpc.id}"
  cidr_block        = "10.0.1.0/24"
  availability_zone = "eu-central-1a"

  tags = {
    Name = "dwe-subnet"
  }
}

Routing tables are an important concept in AWS and define how traffic is routed in the VPC. The routing table below enables traffic to the internet through the internet gateway:

// create routing table which points to the internet gateway
resource "aws_route_table" "dwe-route" {
  vpc_id = "${aws_vpc.dwe-vpc.id}"

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = "${aws_internet_gateway.dwe-igw.id}"
  }

  tags = {
    Name = "dwe-igw"
  }
}

Once the routing table is defined it needs to be attached to the subnet:

// associate the routing table with the subnet
resource "aws_route_table_association" "subnet-association" {
  subnet_id      = "${aws_subnet.dwe-subnet.id}"
  route_table_id = "${aws_route_table.dwe-route.id}"
}

For ssh and rdp access to the EC2 instances two security groups get provisioned:

// create a security group for ssh access to the linux systems
resource "aws_security_group" "dwe-sg-ssh" {
  name        = "dwe-sg-ssh"
  description = "Allow SSH inbound traffic"
  vpc_id      = "${aws_vpc.dwe-vpc.id}"

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = local.my_ip
  }

  // allow access to the internet
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "dwe-sg-ssh"
  }
}

// create a security group for rdp access to the windows systems
resource "aws_security_group" "dwe-sg-rdp" {
name        = "dwe-sg-rdp"
vpc_id      = "${aws_vpc.dwe-vpc.id}"
description = "Allow RDP inbound traffic"

  ingress {
    from_port   = 3389
    to_port     = 3389
    protocol    = "tcp"
    cidr_blocks = local.my_ip
  }

  // allow access to the internet
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }


  tags = {
    Name = "dwe-sg-rdp"
  }
}

Both define an inbound rule (ssh and rdp) and an outbound rule for being able to connect the internet. Now, as all the basics are there the EC2 instances itself get provisioned based on the building blocks from above:

// create two Ubuntu instances
resource "aws_instance" "i-ubuntu-linux-prod" {
  ami                         = "ami-0cc0a36f626a4fdf5"
  instance_type               = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-ssh.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"

  root_block_device {
    volume_size           = "10"
    volume_type           = "standard"
    delete_on_termination = "true"
  }
  
  tags = {
    Name = "i-ubuntu-linux-prod"
  }  
}

resource "aws_instance" "i-ubuntu-linux-test" {
  ami                         = "ami-0cc0a36f626a4fdf5"
  instance_type               = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-ssh.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"

  root_block_device {
    volume_size           = "10"
    volume_type           = "standard"
    delete_on_termination = "true"
  }

  tags = {
    Name = "i-ubuntu-linux-test"
  } 
}

// create two Amazon linux instances
resource "aws_instance" "i-amazon-linux-prod" {
  ami                         = "ami-0f3a43fbf2d3899f7"
  instance_type               = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-ssh.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"

  root_block_device {
    volume_size           = "10"
    volume_type           = "standard"
    delete_on_termination = "true"
  }

  tags = {
    Name = "i-amazon-linux-prod"
  } 
}

resource "aws_instance" "i-amazon-linux-test" {
  ami                         = "ami-0f3a43fbf2d3899f7"
  instance_type               = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-ssh.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"

  root_block_device {
    volume_size           = "10"
    volume_type           = "standard"
    delete_on_termination = "true"
  }

  tags = {
    Name = "i-amazon-linux-test"
  } 
}

// create two Red Hat linux instances
resource "aws_instance" "i-redhat-linux-prod" {
  ami                         = "ami-0badcc5b522737046"
  instance_type               = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-ssh.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"

  root_block_device {
    volume_size           = "10"
    volume_type           = "standard"
    delete_on_termination = "true"
  }

  tags = {
    Name = "i-redhat-linux-prod"
  } 
}

resource "aws_instance" "i-redhat-linux-test" {
  ami                         = "ami-0badcc5b522737046"
  instance_type               = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-ssh.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"
  
  root_block_device {     
    volume_size           = "10"
    volume_type           = "standard"
    delete_on_termination = "true"
  }
  
  tags = { 
    Name = "i-redhat-linux-test"
  }
}

// create two SUSE linux instances
resource "aws_instance" "i-suse-linux-prod" {
  ami                         = "ami-03b86a97a8f02d44e"
  instance_type               = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-ssh.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"

  root_block_device { 
    volume_size           = "10"
    volume_type           = "standard"
    delete_on_termination = "true"
  }

  tags = { 
    Name = "i-suse-linux-prod"
  }
}

resource "aws_instance" "i-suse-linux-test" {
  ami                         = "ami-03b86a97a8f02d44e"
  instance_type               = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-ssh.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"
  
  root_block_device {     
    volume_size           = "10"
    volume_type           = "standard"
    delete_on_termination = "true"
  }
  
  tags = { 
    Name = "i-suse-linux-test"
  }
}

// create two CentOS linux instances
resource "aws_instance" "i-centos-linux-prod" {
  ami                         = "ami-04cf43aca3e6f3de3"
  instance_type               = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-ssh.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"

  root_block_device { 
    volume_size           = "10"
    volume_type           = "standard"
    delete_on_termination = "true"
  }

  tags = { 
    Name = "i-centos-linux-prod"
  }
}

resource "aws_instance" "i-centos-linux-test" {
  ami                         = "ami-04cf43aca3e6f3de3"
  instance_type               = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-ssh.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"
  
  root_block_device {     
    volume_size           = "10"
    volume_type           = "standard"
    delete_on_termination = "true"
  }
  
  tags = { 
    Name = "i-centos-linux-test"
  }
}

// create two Windows instances
resource "aws_instance" "i-windows-prod" {
  ami           = "ami-034937fd7f621ba85"
  instance_type = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-rdp.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"

  root_block_device { 
    volume_size           = "30"
    volume_type           = "standard"
    delete_on_termination = "true"
  }

  tags = { 
    Name = "i-windows-prod"
  }
}

resource "aws_instance" "i-windows-test" {
  ami           = "ami-034937fd7f621ba85"
  instance_type = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-rdp.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"

  root_block_device { 
    volume_size           = "30"
    volume_type           = "standard"
    delete_on_termination = "true"
  }

  tags = {
    Name = "i-windows-test"
  }
}

And that’s it. To check what actually will be done by Terraform there is the “plan” command. As the output is quite long I’ll skip most of it and just present the last few lines:

dwe@dwe:~/Documents/aws/ssm_demo$ terraform plan

...

Plan: 19 to add, 0 to change, 0 to destroy.

------------------------------------------------------------------------

Note: You didn't specify an "-out" parameter to save this plan, so Terraform
can't guarantee that exactly these actions will be performed if
"terraform apply" is subsequently run.

Once you are happy with that you can “apply” the execution plan and everything will get provisioned and you confirmed by “yes”:

dwe@dwe:~/Documents/aws/ssm_demo$ terraform apply

Plan: 19 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

Sit back, relax and one or two minutes later your AWS playground is ready:

aws_vpc.dwe-vpc: Creating...
aws_vpc.dwe-vpc: Creation complete after 2s [id=vpc-026cdc481d5365074]
aws_internet_gateway.dwe-igw: Creating...
aws_subnet.dwe-subnet: Creating...
aws_security_group.dwe-sg-ssh: Creating...
aws_security_group.dwe-sg-rdp: Creating...
aws_subnet.dwe-subnet: Creation complete after 0s [id=subnet-028e27fef8df3b963]
aws_internet_gateway.dwe-igw: Creation complete after 0s [id=igw-0656108a04d5ea0a5]
aws_route_table.dwe-route: Creating...
aws_security_group.dwe-sg-rdp: Creation complete after 1s [id=sg-0764508e3a5234393]
aws_route_table.dwe-route: Creation complete after 1s [id=rtb-07691bc54b40af0ae]
aws_route_table_association.subnet-association: Creating...
aws_instance.i-windows-prod: Creating...
aws_instance.i-windows-test: Creating...
aws_security_group.dwe-sg-ssh: Creation complete after 1s [id=sg-053995952f558a4ff]
aws_instance.i-centos-linux-test: Creating...
aws_instance.i-amazon-linux-prod: Creating...
aws_instance.i-amazon-linux-test: Creating...
aws_instance.i-redhat-linux-prod: Creating...
aws_instance.i-centos-linux-prod: Creating...
aws_instance.i-redhat-linux-test: Creating...
aws_instance.i-ubuntu-linux-test: Creating...
aws_route_table_association.subnet-association: Creation complete after 0s [id=rtbassoc-07c7d4282033c4a71]
aws_instance.i-ubuntu-linux-prod: Creating...
aws_instance.i-windows-prod: Still creating... [10s elapsed]
aws_instance.i-windows-test: Still creating... [10s elapsed]
aws_instance.i-centos-linux-test: Still creating... [10s elapsed]
aws_instance.i-amazon-linux-test: Still creating... [10s elapsed]
aws_instance.i-amazon-linux-prod: Still creating... [10s elapsed]
aws_instance.i-redhat-linux-prod: Still creating... [10s elapsed]
aws_instance.i-centos-linux-prod: Still creating... [10s elapsed]
aws_instance.i-redhat-linux-test: Still creating... [10s elapsed]
aws_instance.i-ubuntu-linux-test: Still creating... [10s elapsed]
aws_instance.i-ubuntu-linux-prod: Still creating... [10s elapsed]
aws_instance.i-amazon-linux-test: Creation complete after 13s [id=i-02706717c3440723a]
aws_instance.i-suse-linux-test: Creating...
aws_instance.i-ubuntu-linux-test: Creation complete after 16s [id=i-0d2999aa319a90a3d]
aws_instance.i-centos-linux-test: Creation complete after 16s [id=i-03923fcf9d5881421]
aws_instance.i-suse-linux-prod: Creating...
aws_instance.i-ubuntu-linux-prod: Creation complete after 16s [id=i-00967725bc758f3ef]
aws_instance.i-redhat-linux-test: Creation complete after 16s [id=i-02a705327fb0acb61]
aws_instance.i-windows-prod: Creation complete after 16s [id=i-09c3fcc90491ef2cf]
aws_instance.i-redhat-linux-prod: Creation complete after 16s [id=i-0161726dfe1ed890b]
aws_instance.i-windows-test: Creation complete after 16s [id=i-02f567d11d32444fd]
aws_instance.i-amazon-linux-prod: Still creating... [20s elapsed]
aws_instance.i-centos-linux-prod: Still creating... [20s elapsed]
aws_instance.i-suse-linux-test: Still creating... [10s elapsed]
aws_instance.i-amazon-linux-prod: Creation complete after 23s [id=i-0b799879e77ce8b33]
aws_instance.i-suse-linux-prod: Still creating... [10s elapsed]
aws_instance.i-centos-linux-prod: Still creating... [30s elapsed]
aws_instance.i-centos-linux-prod: Creation complete after 32s [id=i-0482d958849f86483]
aws_instance.i-suse-linux-test: Still creating... [20s elapsed]
aws_instance.i-suse-linux-prod: Still creating... [20s elapsed]
aws_instance.i-suse-linux-test: Still creating... [30s elapsed]
aws_instance.i-suse-linux-test: Creation complete after 32s [id=i-0b35d559853f9f0d6]
aws_instance.i-suse-linux-prod: Still creating... [30s elapsed]
aws_instance.i-suse-linux-prod: Creation complete after 33s [id=i-062df970b894a23da]

Apply complete! Resources: 19 added, 0 changed, 0 destroyed.

Once you’re done with your tests, simply destroy the whole stuff by usinf the “destroy” command:

dwe@dwe:~/Documents/aws/ssm_demo$ terraform destroy
...
Plan: 0 to add, 0 to change, 19 to destroy.

Do you really want to destroy all resources?
  Terraform will destroy all your managed infrastructure, as shown above.
  There is no undo. Only 'yes' will be accepted to confirm.

  Enter a value: yes
...
aws_security_group.dwe-sg-rdp: Destroying... [id=sg-0764508e3a5234393]
aws_security_group.dwe-sg-rdp: Destruction complete after 1s
aws_security_group.dwe-sg-ssh: Destruction complete after 1s
aws_subnet.dwe-subnet: Destruction complete after 1s
aws_vpc.dwe-vpc: Destroying... [id=vpc-026cdc481d5365074]
aws_vpc.dwe-vpc: Destruction complete after 0s

Destroy complete! Resources: 19 destroyed.

Quite easy, always reproducible and fast.

Cet article A simple Terraform script to create an AWS EC2 playground est apparu en premier sur Blog dbi services.

odacli create-database error DCS-10802: Insufficient disk space on file system: database

10 hours 5 min ago
Introduction

I was reimaging an X6-2M ODA after various patching troubles, and everything was fine. After several databases created, the next ones could no more be created.

DCS-10802: Insufficient disk space on file system: database. Expected free space (MB): {1}, available space (MB): {2}

I’ve spent some time on it, and finally found the cause of the problem. And the solution.

Context

After successfully reimaged an X6-2M ODA with 18.5, and applied the patch for the firmwares, ILOM and disks, I was creating all the databases with odacli with the following commands:


odacli create-database -hm XyXyXyXyXyXy --dbstorage ACFS --characterset WE8MSWIN1252 --databaseUniqueName HPMVRN --dbhomeid '0704ef7c-0cb9-4525-8edb-8d70b8f7ddfb' --dblanguage AMERICAN --dbname HPMVRN --dbshape odb1s --dbterritory AMERICA --no-cdb --no-dbconsole --json
odacli create-database ...

Each database is created with the smallest shape odb1s, as I later fine tune each instance according to my needs.

After the 8th or 9th database created, the next ones ended with a failure:

odacli describe-job -i "2ad9f7e8-331d-4a82-bb7b-c88ddad1cdf8"

Job details
----------------------------------------------------------------
ID: 2ad9f7e8-331d-4a82-bb7b-c88ddad1cdf8
Description: Database service creation with db name: HPMVRN
Status: Failure
Created: November 8, 2019 1:33:23 PM CET
Message: DCS-10802:Insufficient disk space on file system: database. Expected free space (MB): {1}, available space (MB): {2}

Task Name Start Time End Time Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------
Database Service creation November 8, 2019 1:33:23 PM CET November 8, 2019 1:33:58 PM CET Failure
Database Service creation November 8, 2019 1:33:23 PM CET November 8, 2019 1:33:58 PM CET Failure
Setting up ssh equivalance November 8, 2019 1:33:23 PM CET November 8, 2019 1:33:23 PM CET Success
Creating volume datHPMVRN November 8, 2019 1:33:23 PM CET November 8, 2019 1:33:42 PM CET Success
Creating ACFS filesystem for DATA November 8, 2019 1:33:42 PM CET November 8, 2019 1:33:57 PM CET Success
Database Service creation November 8, 2019 1:33:57 PM CET November 8, 2019 1:33:58 PM CET Failure
Database Creation November 8, 2019 1:33:57 PM CET November 8, 2019 1:33:58 PM CET Failure

Analysis

Error seems obvious: Insufficient disk space. Let’s check the disk space:


df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VolGroupSys-LogVolRoot
30G 5.6G 23G 20% /
tmpfs 126G 1.1G 125G 1% /dev/shm
/dev/sda1 477M 41M 411M 9% /boot
/dev/mapper/VolGroupSys-LogVolOpt
109G 85G 19G 82% /opt
/dev/mapper/VolGroupSys-LogVolU01
148G 31G 110G 22% /u01
/dev/asm/datdbtest-8 100G 2.2G 98G 3% /u02/app/oracle/oradata/DBTEST
/dev/asm/datgotst-8
100G 2.2G 98G 3% /u02/app/oracle/oradata/GOTST
/dev/asm/datgeval-8
100G 2.2G 98G 3% /u02/app/oracle/oradata/GEVAL
/dev/asm/commonstore-8
5.0G 391M 4.7G 8% /opt/oracle/dcs/commonstore
/dev/asm/datsmval-8
100G 44G 57G 44% /u02/app/oracle/oradata/SMVAL
/dev/asm/datvival-8
100G 34G 67G 34% /u02/app/oracle/oradata/VIVAL
/dev/asm/datvjval-8
100G 56G 45G 56% /u02/app/oracle/oradata/VJVAL
/dev/asm/dump-8 200G 132G 69G 66% /dpdumplocal
/dev/asm/dataoval-8
100G 2.2G 98G 3% /u02/app/oracle/oradata/AOVAL
/dev/asm/datgrtst-8 100G 2.2G 98G 3% /u02/app/oracle/oradata/GRTST
/dev/asm/datgival-8
100G 7.8G 93G 8% /u02/app/oracle/oradata/GIVAL
/dev/asm/reco-329 74G 56G 19G 76% /u03/app/oracle
/dev/asm/datgetst-8 100G 2.2G 98G 3% /u02/app/oracle/oradata/GETST
/dev/asm/datgftst-8
100G 30G 71G 30% /u02/app/oracle/oradata/GFTST
/dev/asm/datgctst-8
100G 2.2G 98G 3% /u02/app/oracle/oradata/GCTST
/dev/asm/dathpmvrn-8 100G 448M 100G 1% /u02/app/oracle/oradata/HPMVRN

No filesystem is full. And create-database managed to create the acfs volume for data successfully. Let’s try to put something in it:


cp /the_path/the_big_file /u02/app/oracle/oradata/HPMVRN/

No problem with this acfs volume.

Let’s try to create the database into ASM:


odacli list-databases | tail -n 1

bed7f9a0-e108-4423-8b2c-d7c33c795e87 HPMVRN Si 12.1.0.2 false Oltp Odb1s Acfs Failed 0704ef7c-0cb9-4525-8edb-8d70b8f7ddfb

odacli delete-database -i "bed7f9a0-e108-4423-8b2c-d7c33c795e87"

odacli create-database -hm XyXyXyXyXyXy --dbstorage ASM --characterset WE8MSWIN1252 --databaseUniqueName HPMVRN --dbhomeid '0704ef7c-0cb9-4525-8edb-8d70b8f7ddfb' --dblanguage AMERICAN --dbname HPMVRN --dbshape odb1s --dbterritory AMERICA --no-cdb --no-dbconsole --json

No problem here, really seems to be related to acfs.

Where does the create-database also need free space? For sure, in the RECOvery area filesystem, created with the very first database:


Filesystem Size Used Avail Use% Mounted on
/dev/asm/reco-329 74G 56G 19G 76% /u03/app/oracle

Let’s create a file in this filesystem:

cp /the_path/the_big_file /u03/app/oracle/

No problem.

Quite strange, and as the database is not yet created, there is no alert_HPMVRN.log to look for the error…

Maybe the RECOvery area filesystem is not big enough for odacli. acfs concept means autoextensible filesystems, but as all my Fast Recovery Areas of all my databases probably won’t fit in the allocated 74GB, odacli may fail. How to extend the acfs RECOvery filesystem? Just put enough dummy files in it, and then remove them.


cd /u03/app/oracle
cp /the_path/the_big_file tmpfile1
cp tmpfile1 tmpfile2
cp tmpfile1 tmpfile3
cp tmpfile1 tmpfile4
cp tmpfile1 tmpfile5
cp tmpfile1 tmpfile6
rm -rf tmp*

df -h /u03/app/oracle
Filesystem Size Used Avail Use% Mounted on
/dev/asm/reco-329 88G 59G 30G 67% /u03/app/oracle

RECOvery filesystem is slightly bigger, let’s retry the database creation:


odacli delete-database -i "985b1d37-6f84-4d64-884f-3a429c195a5d"

odacli create-database -hm XyXyXyXyXyXy --dbstorage ACFS --characterset WE8MSWIN1252 --databaseUniqueName HPMVRN --dbhomeid '0704ef7c-0cb9-4525-8edb-8d70b8f7ddfb' --dblanguage AMERICAN --dbname HPMVRN --dbshape odb1s --dbterritory AMERICA --no-cdb --no-dbconsole --json

odacli describe-job -i "d5022d8b-9ddb-4f93-a84a-36477657794f"

Job details
----------------------------------------------------------------
ID: d5022d8b-9ddb-4f93-a84a-36477657794f
Description: Database service creation with db name: HPMVRN
Status: Success
Created: November 8, 2019 1:47:28 PM CET
Message:

Task Name Start Time End Time Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------
Setting up ssh equivalance November 8, 2019 1:47:30 PM CET November 8, 2019 1:47:40 PM CET Success
Creating volume datHPMVRN November 8, 2019 1:47:40 PM CET November 8, 2019 1:47:59 PM CET Success
Creating ACFS filesystem for DATA November 8, 2019 1:47:59 PM CET November 8, 2019 1:48:14 PM CET Success
Database Service creation November 8, 2019 1:48:14 PM CET November 8, 2019 1:54:38 PM CET Success
Database Creation November 8, 2019 1:48:14 PM CET November 8, 2019 1:53:02 PM CET Success
Change permission for xdb wallet files November 8, 2019 1:53:02 PM CET November 8, 2019 1:53:02 PM CET Success
Place SnapshotCtrlFile in sharedLoc November 8, 2019 1:53:02 PM CET November 8, 2019 1:53:04 PM CET Success
SqlPatch upgrade November 8, 2019 1:54:02 PM CET November 8, 2019 1:54:35 PM CET Success
updating the Database version November 8, 2019 1:54:35 PM CET November 8, 2019 1:54:38 PM CET Success
create Users tablespace November 8, 2019 1:54:38 PM CET November 8, 2019 1:54:40 PM CET Success

This is it. I was then able to create the next databases without any problem.

Final words

Keep in mind that all your Fast Recovery Areas should fit in your dedicated filesystem. To make sure that you will not encounter problems, resize you RECOvery filesystem to the sum of all your target FRA with acfsutil:

acfsutil size 500G –d /dev/asm/reco-329 /u03/app/oracle

Autoextensible filesystem never ensures you it will succeed in case of extension.

Or simply go for ASM instead of acfs on your ODA, if you do not need acfs features like snapshots. ASM is simplier and more efficient because it does not provision disk space for each database like acfs.

Cet article odacli create-database error DCS-10802: Insufficient disk space on file system: database est apparu en premier sur Blog dbi services.

Avoiding patching madness by using AWS SSM

Fri, 2019-11-08 13:15

As we have more and more customers either currently moving to AWS or already moved much, if not even all, of their workloads to AWS, one question pops up quite frequently: Now that we have so many EC2 instances up and running how do we regularly patch them? There are many tools around for patching operating systems, such as Microsoft SSCM, Red Hat Satellite or SUSE Manager just to name a few. There are many other as well but most of them have an important drawback: Either they can only patch Windows systems or they can only patch Linux systems (and even here usually only a subset of the distributions currently in use). This is where AWS System Manager becomes very handy as you can patch most of the commonly used operating system using one single tool. You can even patch your on-prem instances if your network setup includes the on-prem instances as well. In this post I’ll share what we did for one of our customers to remove the pain of either using multiple tools to patch all the operating systems or to manually patch all the systems and then (manually) document what has been done and what was the result.

Managed Instances

Once you start with AWS SSM there are a few key concepts you need to understand. The first one is the so-called “Managed Instance”. For an instance being managed it must fulfill two requirements:

  1. The AWS agent must be deployed on the instance
  2. An IAM role with the correct permission must be assigned to the instance

For the AWS agent it is quite easy if you are deployed any of these because the agent will be there be default:

  • Windows Server 2003-2012 R2 AMIs published in November 2016 or later
  • Amazon Linux
  • Amazon Linux 2
  • Ubuntu Server 16.04
  • Ubuntu Server 18.04

If you are running something else the agent needs to be installed manually as described here for Windows and here for Linux. For on-prem instances it is straight forward as well (Windows here, Linux here) but you need to create Managed Instance Actications as well which is not in the scope of this post.

For the IAM role we’ll first look at what I have currently deployed:

This are two Windows, two Red Hat, two Ubuntu, two Amazon Linux, two CentOS and two SUSE instances, one for Prod and one for Test each. The “Prod” and “Test” naming becomes important later, just ignore it for now. If we take a look at how many managed instances we have right now in System Manager we will see zero, even that the Windows, Ubuntu and Amazon Linux instances have the SSM agent deployed automatically (otherwise we would see a list of managed instances here):

We will use the “Quick Setup” here to setup the IAM role and instance profiles:

For the scope of this post we will go with defaults suggested by AWS:

We want all the instances:


The last screen shows an overview of what is currently happening, basically collecting all the information that will be stored in the AWS SSM inventory. Before you’ll see any instances in “Managed Instances” you will need to wait until the pending inventory actions have been completed:

Once that is done all the managed instances should be there (notice that I manually deployed the ASM SSM agent on the SUSE, CentOS and RedHat instances, otherwise they would not show up here):

Patch baselines

Once all instances are “managed” there is the next concept we need to understand: Patch baselines. Patch baselines basically define which set of patches (classification, severity, …) shall get applied to which product (e.g. Red Hat 7, Red Hat 8, …). As we have six different operating system we need at least six patch baselines. In reality you’ll probably even need twelve because you will have production and test systems. Usually you want to have the same patches applied to production as you have applied to test some days or weeks before and therefore we need two patch baselines per operating system flavor. Patch baselines are listed in SSM under “Patch manager”:

The following screenshots show how I created the patch baseline for the Red Hat test systems:

For the Red Hat production systems the patch baseline is exactly the same but I’ve put an approval delay of 14 days:

Why the delay? Suppose you patch your test systems every first Tuesday a month and then you patch the production systems 14 days later. As you want to have the same patches on production and test, the production patches get an approval delay of 14 days. This is how you can make sure that your systems are on the same level even if production is patched later than test. One point consider is, that this does not apply to the Ubuntu systems as there is no approval delay for Ubuntu patches:

Once all the base lines have been created it looks like this:

The “Patch group” tag

The next key concept that comes into the game are tags. No matter what you well be doing in the cloud, without tags you’re lost. There is just no way around tags if you want to automate processes in the cloud. When it comes the AWS SSM there is the “Patch Group” tag and this tag needs to be assigned to each instance you want to have patched automatically. Let’s assume you want to have the test instances patched every first Saturday each month and the production instances every 3rd Saturday each month ( 6 a.m. in the morning for both). In addition all the Linux hosts should be patched before the Windows hosts, because the Linux hosts run the databases and the Windows hosts run the application servers or the application (of for whatever reason there is a dependency between the hosts and you need to follow a defined order for patching).

What we did to fulfill this requirement is to populate the “Patch Group” tag like this:

  1. All test instances get the value: patch_test_0600_batch_1 or patch_test_0600_batch_2
  2. All production instances get the value: patch_prod_0600_batch_1 or patch_prod_0600_batch_2

The “*batch” values builds the dependency between the Linux and the Windows host. Taking one test Linux and one test Windows as an example the tag would look like this:

I’ve assigned the tags to all the remaining instances accordingly:

  1. i-amazon-linux-test: patch_test_0600_batch_1
  2. i-centos-linux-test: patch_test_0600_batch_1
  3. i-redhat-linux-test: patch_test_0600_batch_1
  4. i-suse-linux-test: patch_test_0600_batch_1
  5. i-ubuntu-linux-test: patch_test_0600_batch_1
  6. i-windows-test: patch_test_0600_batch_2

Same procedure for the production instances but using the prod_* tags of course. Only assigning these tags to the instances is not enough though. We need to assign them to the patch baselines as well so the baseline gets selected when a patch is supposed to run on the systems. Assigning a “Patch Group” to a patch baselines basically links the patch baseline to instances with the same tag value per operating system. If you don’t do that you will always get the default patch baseline for an operating system and you have no chance of applying the same patches to test and prod because you can not have different approval delays.

This needs to be done for all the patch baselines we created above. The patch_test_0600_batch* tags get attached to the baselines for test and the patch_prod_0600_batch* tags get attached to the baselines for production.

Maintenance windows

The last building block are the maintenance windows. They define when an action will executed on which systems. You can not only use them for patching but for many other tasks as well (e.g. running a PowerShell script on Windows hosts, running a shell script on Linux hosts, running an Ansible playbook and much more).

As we want to schedule patching for all the test instances the first Saturday of a month and all production systems the third Saturday of the month we’ll need to two maintenance windows, one for test and for production. We’ll start with the one for test:

For the scheduling you should go for a cron expression:

“6F” means the first Saturday (0 is Sunday) each month. Note that if you skip the “Schedule timezone” timezone you will need to provide UTC time in the cron expression. If you do specify your timezone, AWS is automatically calculating the correct UTC time for you, as you can see once the maintenance window is created:

Now we have the scheduling but the maintenance window has no idea what to run and against what to run it. What to run is specified by adding targets:

You should definitely give a name to the target so you can easily identify the target afterwards:

By specifying the “Patch Group” instance tag with our values for the test systems that should run in the first batch we have all the Linux systems registered with this maintenance window. The second target will be the Windows systems that should be patched once the Linux systems have been done:

All test systems registered:

Now that we defined where to run we need to define what to run and this is specified by the “Run command”:

In our case the document is “AWS-RunPatchBaseline” but it could be any of the documents listed as mentioned earlier:

The task priority is 1 (because we want this to be executed first) and the target is “patch_test_0600_batch_1” (this is why the name for target is important. You will probably not remember the unique string that was generated when you registered the target):

The rate control option allows you to control concurrenry and error thresholds, for the IAM role go with the defaults:

We highly recommend to store the output in a S3 bucket, otherwise the output of the run command will be truncated after 2000 charcters. The parameter should be “Install” (you could also go for “Scan” if you only want to scan for available patches):

Do exactly the same for the patch_test_0600_batch_2 group, with priority 2:

Both run commands are now registered:

And that’s it. Repeat the same for the production machines and all your instances are scheduled for patching. Once the maintenance window executed you can get the logs in the “Run command” secion:



Here you have the link to the log in S3:

Conclusion

When you start using AWS SSM it is a bit hard to understand all the relations between patch baselines, the patch group tag, target, run commands and maintenance windows. Once you got that it is quite easy to schedule patching of your whole infratructure (even on prem) when you run operation systems AWS has patch baselines for. The logs should definitely go to S3 so you have the full output available.

Cet article Avoiding patching madness by using AWS SSM est apparu en premier sur Blog dbi services.

Connection pooling with PgBouncer

Thu, 2019-11-07 09:38

Some of you may know the case: As soon as the number of users grow, the number of resource problems increases. Have you ever thought about using a connection pooler? Too complex, too much administration effort? In this post I like to explain, how the connection pooler can help you with your memory, as well as showing you the simplicity of setup connection pooling with PgBouncer.

Introduction

By default PostgreSQL forks it’s main process to child processes for every new connection. In the course of time this can lead to more and more processes on the server. On one hand, this is pretty cool, because it can provide more stability and a nice view of resource utilization per connection. But if there are many short time connections, the disadvantages will predominate. The more connections you have, the more RAM will be used.
The solution for that problem can be a connection pooler like PgBouncer, an opensource connection pooling middleware espacially designed for Postgres. It will act like a wrapper around the database connections. It has the internals for the connection between the database and the pool, but everything is hidden from the application that connects.

Installation

For the installation of pgBouncer you can decide between two possibilities
1. using yum
2. building from git (https://pgbouncer.github.io/install.html#building-from-git)

To keep it simple, we go on with method 1.

postgres@centos-mini:/u02/pgdata/11/PG1/ [PG1] sudo yum install pgbouncer
Loaded plugins: fastestmirror
Determining fastest mirrors
epel/x86_64/metalink                                                                |  28 kB  00:00:00
 * base: pkg.adfinis-sygroup.ch
 * epel: pkg.adfinis-sygroup.ch
 * extras: pkg.adfinis-sygroup.ch
 * updates: pkg.adfinis-sygroup.ch
base                                                                                | 3.6 kB  00:00:00
epel                                                                                | 5.3 kB  00:00:00
extras                                                                              | 2.9 kB  00:00:00
ius                                                                                 | 1.3 kB  00:00:00
pgdg10                                                                              | 3.6 kB  00:00:00
pgdg11                                                                              | 3.6 kB  00:00:00
updates                                                                             | 2.9 kB  00:00:00
(1/10): base/7/x86_64/group_gz                                                      | 165 kB  00:00:06
(2/10): extras/7/x86_64/primary_db                                                  | 153 kB  00:00:00
(3/10): epel/x86_64/group_gz                                                        |  90 kB  00:00:06
(4/10): epel/x86_64/primary_db                                                      | 6.9 MB  00:00:08
(5/10): epel/x86_64/updateinfo                                                      | 1.0 MB  00:00:08
(6/10): pgdg11/7/x86_64/primary_db                                                  | 337 kB  00:00:01
(8/10): base/7/x86_64/primary_db                                                    | 6.0 MB  00:00:10
(10/10): updates/7/x86_64/primary_db                                                | 2.8 MB  00:00:01
(11/10): ius/x86_64/primary                                                         | 139 kB  00:00:06
(12/10): pgdg10/7/x86_64/primary_db                                                 | 384 kB  00:00:06
ius                                                                                              631/631
Resolving Dependencies
--> Running transaction check
---> Package pgbouncer.x86_64 0:1.12.0-1.rhel7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

===========================================================================================================================================================================================================
 Package                           Arch                                               Version                                                     Repository                                          Size
===========================================================================================================================================================================================================
Installing:
 pgbouncer                         x86_64                                             1.12.0-1.rhel7                                              pgdg10                                             207 k

Transaction Summary
===========================================================================================================================================================================================================
Install  1 Package

Total download size: 207 k
Installed size: 477 k
Is this ok [y/d/N]: y
Downloading packages:
pgbouncer-1.12.0-1.rhel7.x86_64.rpm                                                                                                                                                | 207 kB  00:00:06
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : pgbouncer-1.12.0-1.rhel7.x86_64                                                                                                                                                     1/1
  Verifying  : pgbouncer-1.12.0-1.rhel7.x86_64                                                                                                                                                     1/1

Installed:
  pgbouncer.x86_64 0:1.12.0-1.rhel7

Complete!
Configuration pgbouncer.ini

PgBouncer has one central congfiguration file called pgbouncer.ini, per default it is located under /etc/pgbouncer and it is used to configure the PgBouncer pool.
You can define a lot of parameters in this file, most of them commented out by default and you can start with a very minimal configuration. It includes generic settings as logile, listen_addr, listen_port… as well as connectivity, console access control, admin_users and many, many more. The full list of parameters can be found in the pgbouncer documentation, which is really detailled and gives you a good overview.
Lets have a look to a easy sample of pgbouncer.ini file

cat /etc/pgbouncer/pgbouncer.ini
[databases]
bouncer= host=localhost dbname=bouncer

[pgbouncer]
listen_addr=127.0.0.1
listen_port= 6432
auth_type = md5
auth_file= /etc/pgbouncer/userlist.txt
admin_users=bounce
pool_mode=session
max_client_conn = 100
default_pool_size = 20
logfile = /etc/pgbouncer/log/pgbouncer.log
pidfile = /etc/pgbouncer/pgbouncer.pid

The ini-file has two sections, first the [database] section defines the alias(es) for the database(s). Just start a new line for every database. You can also define user and port etc. and afterwards the [pgbouncer] section, where the pool configuration is done. You can also start with a very minimal configuration.

One of the most important parameters is pool_mode, which defines how a server connection can be reused. 3 modes can be defined:
session: This is the default value: Connections are released back to the pool after disconnection.
transaction: Releases the connection to the pool once a transaction finishes.
statement: After a query finishes, the connection is released back to he pool.

The other parameters in section pgbouncer shortly explained:
listen_addr: List of addresses where to listen for TCP connection.
listen_port: Listening port
admin_users: Users from the auth_file which get access to a special pgbouncer database. This database provides performance-related information about PgBouncer.
max_client_conn: This is the maximum number of client connections allowed. The default value is 100, but there is also a formula to calculate the value
default_pool_size: The number of server connections allowed per user/database pair. The default value is 20.
logfile: This one is self-explaining. The log file location.
pidfile: The location of the PID file.
auth_type and auth_file: This two belong together. Auth_type specifies how to authenticate users (pam|hba|md5) against PgBouncer and auth_file contains the usernames and passwords.

userlist.txt

As already mentioned this file has a very simple structure, username and password. You can either write the password in plain text or the MD5 has of the password. Of course it is not very secure to put the plain text password in here.

cat /etc/pgbouncer/userlist.txt
"bounce" "md51db1c086e81505132d1834e06d07420e"
"postgres" "md53175bce1d3201d16594cebf9d7eb3f9d"
Start PgBouncer

Now all the configuration is done and PgBouncer can be started. It is possible to start PgBouncer on command line and you see the log output directly:

postgres@centos-mini:/etc/pgbouncer/ [PG1] /bin/pgbouncer pgbouncer.ini
2019-11-06 19:40:05.862 CET [13498] LOG kernel file descriptor limit: 1024 (hard: 4096); max_client_conn: 100, max expected fd use: 192
2019-11-06 19:40:05.864 CET [13498] LOG listening on 127.0.0.1:16432
2019-11-06 19:40:05.864 CET [13498] LOG listening on unix:/tmp/.s.PGSQL.16432
2019-11-06 19:40:05.864 CET [13498] LOG process up: PgBouncer 1.12.0, libevent 2.0.21-stable (epoll), adns: libc-2.17, tls: OpenSSL 1.0.2k-fips  26 Jan 2017
2019-11-06 19:41:05.868 CET [13498] LOG stats: 0 xacts/s, 0 queries/s, in 0 B/s, out 0 B/s, xact 0 us, query 0 us, wait 0 us
2019-11-06 19:41:39.325 CET [13498] LOG C-0x18da360: db1/user1@127.0.0.1:58648 login attempt: db=db1 user=user1 tls=no
2019-11-06 19:41:39.326 CET [13498] LOG C-0x18da360: db1/user1@127.0.0.1:58648 closing because: client unexpected eof (age=0s)
2019-11-06 19:41:47.577 CET [13498] LOG C-0x18da360: db1/user1@127.0.0.1:58652 login attempt: db=db1 user=user1 tls=no
2019-11-06 19:41:47.579 CET [13498] LOG S-0x18e0c30: db1/user1@[::1]:5432 new connection to server (from [::1]:37654)
2019-11-06 19:42:05.869 CET [13498] LOG stats: 0 xacts/s, 0 queries/s, in 0 B/s, out 0 B/s, xact 0 us, query 0 us, wait 157 us
2019-11-06 19:43:05.872 CET [13498] LOG stats: 0 xacts/s, 0 queries/s, in 0 B/s, out 2 B/s, xact 1522 us, query 1522 us, wait 0 us

Furthermore it is possible to create a service which starts automatically in the background after every reboot:

cat /etc/systemd/system/pgbouncer.service
[Unit]
Description=A lightweight connection pooler for PostgreSQL
After=syslog.target
After=network.target

[Service]
Type=simple

User=postgres
Group=postgres

# Path to the init file
Environment=BOUNCERCONF=/etc/pgbouncer/pgbouncer.ini

PIDFile=/var/run/pgbouncer/pgbouncer.pid

# Where to send early-startup messages from the server
# This is normally controlled by the global default set by systemd
# StandardOutput=syslog

ExecStart=/bin/pgbouncer ${BOUNCERCONF}
ExecReload=/bin/kill -HUP $MAINPID
KillSignal=SIGINT

# Give a reasonable amount of time for the server to start up/shut down
TimeoutSec=300

[Install]
WantedBy=multi-user.target
postgres@centos-mini:/etc/ [PG1] sudo systemctl start pgbouncer
postgres@centos-mini:/etc/ [PG1] sudo systemctl status pgbouncer
● pgbouncer.service - A lightweight connection pooler for PostgreSQL
   Loaded: loaded (/etc/systemd/system/pgbouncer.service; disabled; vendor preset: disabled)
   Active: active (running) since Thu 2019-11-07 15:17:09 CET; 4s ago
 Main PID: 17298 (pgbouncer)
   CGroup: /system.slice/pgbouncer.service
           └─17298 /bin/pgbouncer /etc/pgbouncer/pgbouncer.ini

Nov 07 15:17:09 centos-mini systemd[1]: Started A lightweight connection pooler for PostgreSQL.
Nov 07 15:17:09 centos-mini pgbouncer[17298]: 2019-11-07 15:17:09.127 CET [17298] LOG kernel file descriptor limit: 1024 (hard: 4096); max_client_conn: 100, max expected fd use: 172
Nov 07 15:17:09 centos-mini pgbouncer[17298]: 2019-11-07 15:17:09.127 CET [17298] LOG listening on 127.0.0.1:6432
Nov 07 15:17:09 centos-mini pgbouncer[17298]: 2019-11-07 15:17:09.127 CET [17298] LOG listening on unix:/tmp/.s.PGSQL.6432
Nov 07 15:17:09 centos-mini pgbouncer[17298]: 2019-11-07 15:17:09.127 CET [17298] LOG process up: PgBouncer 1.12.0, libevent 2.0.21-stable (epoll), adns: libc-2.17, tls: OpenSSL 1.0.2k-fips  26 Jan 2017
15:17:13 postgres@centos-mini:/etc/ [PG1]

PgBouncer is running now and you can connect to the database using PgBouncer.

postgres@centos-mini:/etc/ [PG1] psql -U bounce -p 6432 -h localhost bouncer
Password for user bounce:
psql (11.4 dbi services build)
Type "help" for help.
bouncer=>
bouncer=>
Monitoring

I already mentioned the admin users before and want to explain them a little bit more detailed now.
PgBouncer allows users with admin rights to connect to the virtual database “pgbouncer”. You can use this database to see who is connecting, how many active connections are in a pool and of course many more. As a good starting point, use “SHOW HELP” as soon as you are connected.

postgres@centos-mini:/etc/ [PG1] psql -U bounce -p 6432 -h localhost pgbouncer
Password for user bounce:
psql (11.4 dbi services build, server 1.12.0/bouncer)
Type "help" for help.
pgbouncer=# SHOW HELP;
NOTICE:  Console usage
DETAIL:
        SHOW HELP|CONFIG|DATABASES|POOLS|CLIENTS|SERVERS|USERS|VERSION
        SHOW FDS|SOCKETS|ACTIVE_SOCKETS|LISTS|MEM
        SHOW DNS_HOSTS|DNS_ZONES
        SHOW STATS|STATS_TOTALS|STATS_AVERAGES|TOTALS
        SET key = arg
        RELOAD
        PAUSE []
        RESUME []
        DISABLE 
        ENABLE 
        RECONNECT []
        KILL 
        SUSPEND
        SHUTDOWN
SHOW
pgbouncer=#
pgbouncer=# SHOW POOLS;
 database  |   user    | cl_active | cl_waiting | sv_active | sv_idle | sv_used | sv_tested | sv_login | maxwait | maxwait_us | pool_mode
-----------+-----------+-----------+------------+-----------+---------+---------+-----------+----------+---------+------------+-----------
 bouncer   | bounce    |         2 |          0 |         0 |       0 |       1 |         0 |        0 |       0 |          0 | session
 pgbouncer | pgbouncer |         1 |          0 |         0 |       0 |       0 |         0 |        0 |       0 |          0 | statement
(2 rows)

pgbouncer=# SHOW CLIENTS;
 type |  user  | database  | state  |   addr    | port  | local_addr | local_port |      connect_time       |      request_time       | wait | wait_us | close_needed |    ptr    | link | remote_pid | tls
------+--------+-----------+--------+-----------+-------+------------+------------+-------------------------+-------------------------+------+---------+--------------+-----------+------+------------+-----
 C    | bounce | bouncer   | active | 127.0.0.1 | 40322 | 127.0.0.1  |       6432 | 2019-11-07 15:24:40 CET | 2019-11-07 15:24:40 CET |    0 |       0 |            0 | 0x1bd9598 |      |          0 |
 C    | bounce | bouncer   | active | 127.0.0.1 | 40332 | 127.0.0.1  |       6432 | 2019-11-07 15:25:12 CET | 2019-11-07 15:25:12 CET |    0 |       0 |            0 | 0x1bd97b0 |      |          0 |
 C    | bounce | pgbouncer | active | 127.0.0.1 | 40314 | 127.0.0.1  |       6432 | 2019-11-07 15:21:50 CET | 2019-11-07 15:25:36 CET |  221 |  440169 |            0 | 0x1bd9380 |      |          0 |
(3 rows)
Conclusion

The above configuration is only a simple example how the configuration can look like. Of course there are many more specifics you can define. PgBouncer is a great tools for connection pooling and can help you to scale down the memory usage of your server. The connections of a pool are always available and in contrast to forking processes, it does not need reach out the database to establish a connection every time. The connections are just there.

Cet article Connection pooling with PgBouncer est apparu en premier sur Blog dbi services.

How to scale up a Patroni cluster

Wed, 2019-11-06 04:43

During the preparation of my presentation for the pgconf.eu I ran into one big issue. I had to stop my cluster to add a new node. That was not the way I wanted to archive this. I want a high availability solution, that can be scaled up without any outage. Due to a little hint during the pgconf.eu I was able to find a solution. In this post I will show the manually scale up, without using a playbook.

Starting position

We start with a 3 node patroni cluster which can be created using this blog post.
Now we want to add a fourth node to the existing etcd and Patroni cluster. In case you also need a playbook to install a forth node, check out my GitHub repository.

Scale up the etcd cluster

This step is only needed, when you want to scale up your etcd cluster as well. To scale up a Patroni cluster it is not necessary to scale up etcd cluster. You can, of course, scale up Patroni without adding more etcd cluster members. But maybe someone also needs to scale up his etcd cluster and searches for a solution. If not, just jump to the next step.

Be sure the etcd and patroni service are not started on the forth node.

postgres@patroni4:/home/postgres/ [PG1] systemctl status etcd
● etcd.service - dbi services etcd service
   Loaded: loaded (/etc/systemd/system/etcd.service; enabled; vendor preset: disabled)
   Active: inactive (dead)
postgres@patroni4:/home/postgres/ [PG1] systemctl status patroni
● patroni.service - dbi services patroni service
   Loaded: loaded (/etc/systemd/system/patroni.service; enabled; vendor preset: disabled)
   Active: inactive (dead)
postgres@patroni4:/home/postgres/ [PG1]

Make the following adjustments in the etcd.conf of the 4th node.

postgres@patroni4:/home/postgres/ [PG1] cat /u01/app/postgres/local/dmk/etc/etcd.conf
name: patroni4
data-dir: /u02/pgdata/etcd
initial-advertise-peer-urls: http://192.168.22.114:2380
listen-peer-urls: http://192.168.22.114:2380
listen-client-urls: http://192.168.22.114:2379,http://localhost:2379
advertise-client-urls: http://192.168.22.114:2379
initial-cluster-state: 'existing'
initial-cluster: patroni1=http://192.168.22.111:2380,patroni2=http://192.168.22.112:2380,patroni3=http://192.168.22.113:2380,patroni4=http://192.168.22.114:2380

Next add the new etcd member to the existing etcd cluster. You can execute that on every existing member of the cluster.

postgres@patroni1:/home/postgres/ [PG1] etcdctl member add patroni4 http://192.168.22.114:2380
Added member named patroni4 with ID dd9fab8349b3cfc to cluster

ETCD_NAME="patroni4"
ETCD_INITIAL_CLUSTER="patroni4=http://192.168.22.114:2380,patroni1=http://192.168.22.111:2380,patroni2=http://192.168.22.112:2380,patroni3=http://192.168.22.113:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"

Now you can start the etcd service on the 4th node.

postgres@patroni4:/home/postgres/ [PG1] sudo systemctl start etcd
postgres@patroni4:/home/postgres/ [PG1] systemctl status etcd
● etcd.service - dbi services etcd service
   Loaded: loaded (/etc/systemd/system/etcd.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2019-10-17 16:39:16 CEST; 9s ago
 Main PID: 8239 (etcd)
   CGroup: /system.slice/etcd.service
           └─8239 /u01/app/postgres/local/dmk/bin/etcd --config-file /u01/app/postgres/local/dmk/etc/etcd.conf
postgres@patroni4:/home/postgres/ [PG1]

And after a short check, we can see, that Node 4 is added to the existing cluster

postgres@patroni4:/home/postgres/ [PG1] etcdctl member list
dd9fab8349b3cfc: name=patroni4 peerURLs=http://192.168.22.114:2380 clientURLs=http://192.168.22.114:2379 isLeader=false
16e1dca5ee237693: name=patroni1 peerURLs=http://192.168.22.111:2380 clientURLs=http://192.168.22.111:2379 isLeader=false
28a43bb36c801ed4: name=patroni2 peerURLs=http://192.168.22.112:2380 clientURLs=http://192.168.22.112:2379 isLeader=false
5ba7b55764fad76e: name=patroni3 peerURLs=http://192.168.22.113:2380 clientURLs=http://192.168.22.113:2379 isLeader=true
Scale up Patroni

Scale up the Patroni cluster is also really easy.
Adjust the host entry in the patroni.yml on the new node.

postgres@patroni4:/home/postgres/ [PG1] cat /u01/app/postgres/local/dmk/etc/patroni.yml | grep hosts
  hosts: 192.168.22.111:2379,192.168.22.112:2379,192.168.22.113:2379,192.168.22.114:2379

Afterwards, start the Patroni service.

postgres@patroni4:/home/postgres/ [PG1] sudo systemctl start patroni
postgres@patroni4:/home/postgres/ [PG1] systemctl status patroni
● patroni.service - dbi services patroni service
   Loaded: loaded (/etc/systemd/system/patroni.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2019-10-17 17:03:19 CEST; 5s ago
  Process: 8476 ExecStartPre=/usr/bin/sudo /bin/chown postgres /dev/watchdog (code=exited, status=0/SUCCESS)
  Process: 8468 ExecStartPre=/usr/bin/sudo /sbin/modprobe softdog (code=exited, status=0/SUCCESS)
 Main PID: 8482 (patroni)
   CGroup: /system.slice/patroni.service
           ├─8482 /usr/bin/python2 /u01/app/postgres/local/dmk/bin/patroni /u01/app/postgres/local/dmk/etc/patroni.yml
           ├─8500 /u01/app/postgres/product/11/db_5/bin/postgres -D /u02/pgdata/11/PG1/ --config-file=/u02/pgdata/11/PG1/postgresql.conf --listen_addresses=192.168.22.114 --max_worker_processes=8 --max_locks_per_transact...
           ├─8502 postgres: PG1: logger
           ├─8503 postgres: PG1: startup   waiting for 000000020000000000000006
           ├─8504 postgres: PG1: checkpointer
           ├─8505 postgres: PG1: background writer
           ├─8506 postgres: PG1: stats collector
           ├─8507 postgres: PG1: walreceiver
           └─8513 postgres: PG1: postgres postgres 192.168.22.114(48882) idle

To be sure, everything runs correctly, check the status of the Patroni cluster

postgres@patroni4:/home/postgres/ [PG1] patronictl list
+---------+----------+----------------+--------+---------+----+-----------+
| Cluster |  Member  |      Host      |  Role  |  State  | TL | Lag in MB |
+---------+----------+----------------+--------+---------+----+-----------+
|   PG1   | patroni1 | 192.168.22.111 |        | running |  2 |       0.0 |
|   PG1   | patroni2 | 192.168.22.112 |        | running |  2 |       0.0 |
|   PG1   | patroni3 | 192.168.22.113 | Leader | running |  2 |       0.0 |
|   PG1   | patroni4 | 192.168.22.114 |        | running |  2 |       0.0 |
+---------+----------+----------------+--------+---------+----+-----------+
Conclusion

Using the playbooks had one failure. The entry for host in the patroni.yml is only checking localhost. When starting the fourth node, Patroni is not looking for all the other hosts, it is just looking for its own availability. This works fine in an initial cluster, but not when you want to extended one.
And: Always keep in mind, you need an uneven number of members for an etcd cluster, don’t add only a forth etcd node.

Cet article How to scale up a Patroni cluster est apparu en premier sur Blog dbi services.

Galera Cluster 4 with MariaDB 10.4.8

Wed, 2019-11-06 02:31

Last month, by a new customer I had to install the latest version of the MariaDB server: 10.4.8 to setup a Galera Cluster with 3 master nodes.
The good news was that this version was shipped with the latest version of the Galera Plugin from Codership: Galera Cluster 4.0.
As usual, installation & configuration was quitte easy.

$ sudo yum -y install MariaDB-server
$ sudo yum list installed|grep -i mariadb
MariaDB-client.x86_64 10.4.8-1.el7.centos @mariadb-main
MariaDB-common.x86_64 10.4.8-1.el7.centos @mariadb-main
MariaDB-compat.x86_64 10.4.8-1.el7.centos @mariadb-main
MariaDB-server.x86_64 10.4.8-1.el7.centos @mariadb-main
galera.x86_64 26.4.2-1.rhel7.el7.centos @mariadb-main
New Features:

But now I just want to introduce some of the new interesting and high level features available in this version.

Streaming Replication

In the previous versions, when we had large and long-running write transactions, we had always conflicts during
the Certification Based Replication and because of this, often transactions were aborted and rolled back.
Some of our customers were really suffering and complaining because of this problem and limitation.
Now, when there will be a big transaction, the node who initiated the transaction will not have to wait till the “commit”
but will break it into fragments, will certify it and will replicate it on the other master nodes while still the transaction will be running.

New System Tables

When having a look to the mysql database, we can see 3 new system tables.
They are containing informations that are already in status variables but now they will be persistent:

MariaDB [mysql]> show tables from mysql like 'wsrep_%';
+---------------------------+
| Tables_in_mysql (wsrep_%) |
+---------------------------+
| wsrep_cluster             |
| wsrep_cluster_members     |
| wsrep_streaming_log.      |
+---------------------------+
3 rows in set (0.000 sec)

These 3 new tables should bring to the database administrators a better overview of the current status of the cluster.

MariaDB [mysql]> select * from wsrep_cluster;
+--------------------------------------+---------+------------+------------------+--------------+
| cluster_uuid                         | view_id | view_seqno | protocol_version | capabilities |
+--------------------------------------+---------+------------+------------------+--------------+
| 6c41b92b-e0f9-11e8-9924-de3112d0ce21 | 3       | 1967.      | 4.               | 184703       |
+--------------------------------------+---------+------------+------------------+--------------+
1 row in set (0.000 sec)

cluster_uuid is the uuid of the cluster, corresponding to the status variable: wsrep_cluster_state_uuid
view_id is the number of cluster configuration changes, corresponding to the status variable wsrep_cluster_conf_id
view_seqno is the latest Galera sequence number, corresponding to the status variable: wsrep_last_committed
protocol_version is the MariaDB wsrep patch version, corresponding to the status variable: wsrep_protocol_version
capabilities is the capabilities bitmask provided by the Galera library.

MariaDB [mysql]> select * from wsrep_cluster_members;
+--------------------------------------+--------------------------------------+-----------+-----------------------+
| node_uuid                            | cluster_uuid                         | node_name | node_incoming_address |
+--------------------------------------+--------------------------------------+-----------+-----------------------+
| 6542be69-ffd5-11e9-a2ed-a363df0547d5 | 6c41b92b-e0f9-11e8-9924-de3112d0ce21 | node1     | AUTO                  |
| 6ae6fec5-ffd5-11e9-bb70-da54860baa6d | 6c41b92b-e0f9-11e8-9924-de3112d0ce21 | node2.    | AUTO                  |
| 7054b852-ffd5-11e9-8a45-72b6a9955d28 | 6c41b92b-e0f9-11e8-9924-de3112d0ce21 | node3     | AUTO                  |
+--------------------------------------+--------------------------------------+-----------+-----------------------+
3 rows in set (0.000 sec)

This system table display the current membership of the cluster.
It contains a row for each node and member in the cluster.
node_uuid is the unique identifier of the master node.
cluster_uuid is the unique identifier of the cluster. It must be the same for all members.
node_name is explicit
node_incoming_address stores the IP address and port for client connections.

MariaDB [mysql]> select * from wsrep_streaming_log;
Empty set (0.000 sec)

This system table will contains rows only if there is a transaction which have the “streaming replication” enabled

Synchronization Functions

This new SQL functions can be used in wsrep synchronization operations.
It is possibble to use them to obtain the GTID (Global Transaction ID)
WSREP_LAST_SEEN_GTID(): returns the GTID of the last write transaction observed by the client
WSREP_LAST_WRITTEN_GTID(): returns the GTID of the last write transaction made by the client
WSREP_SYNC_WAIT_UPTO_GTID(): blocks the client until the node applies and commits the given transaction

Conclusion:

These new features and especially the streaming replication, which is really an improvement and a huge boost to large
transaction support, should bring to users and dba’s satisfaction and hopefully a better opinion of the MariaDB Galera cluster.
In another blog, I will try to demonstrate how this streaming replication works.

Cet article Galera Cluster 4 with MariaDB 10.4.8 est apparu en premier sur Blog dbi services.

Create a Kubernetes cluster with Google Kubernetes Engine

Wed, 2019-11-06 01:16

Nowadays the market for cloud providers is very competitive. Large companies are fighting a very hard battle over the services they provide. Each offers a wide range of more or less identical products with specific features for each.

In my point of view, having deployed Kubernetes clusters in several environments (Cloud and On-Premise), I pay particular attention to Google Cloud for its Google Kubernetes Engine offer. The deployment of a Kubernetes cluster is very fast and allows us to have a test/production environment in a few minutes.

Therefore, in this blog post, we will explain how to create a Kubernetes cluster in Google Cloud with some useful additional resources.

Prerequisites

A Google account is needed. You can create one by following the sign-up link: https://cloud.google.com. Otherwise, you can use the free tier account: https://cloud.google.com/free/?hl=en.

Create your project

Go to the cloud portal through the following link: https://console.cloud.google.com/home/dashboard

The first step is the creation of a project. Before creating a resource, you will need to create a project in order to encapsulate all your resources within it. To properly create a project, follow the below steps:

Enter your project name and click on create:

After a few seconds, your project will be created, and you will have access to the home dashboard:

Create your cluster

Once the project is created and ready to use, let’s create now our Kubernetes cluster. Click on the Kubernetes Engine menu and clusters sub-menu to begin the creation process.

Once the Kubernetes Engine API is enabled, we can click on the create cluster button and configured our cluster as needed.

We choose a standard cluster with 3 cluster nodes. You can edit the resources of your cluster according to your needs. For our example, we kept the default configuration provided by the API.

Click on the create button and after a few minutes, your cluster is ready for usage.

Start using your Kubernetes cluster

Google SDK is needed to use your Kubernetes cluster in your favorite client platform. To install Google SDK follow the instructions here:

SDK Cloud is properly installed, we can now initialize our environment by the following steps:

mehdi@MacBook-Pro: gcloud init
Welcome! This command will take you through the configuration of gcloud.
 
Settings from your current configuration [default] are:
core:
  account: mehdi.bada68@gmail.com
  disable_usage_reporting: 'True'
  project: jx-k8s-2511
 
Pick configuration to use:
 [1] Re-initialize this configuration [default] with new settings
 [2] Create a new configuration
Please enter your numeric choice:  1
 
Your current configuration has been set to: [default]
 
You can skip diagnostics next time by using the following flag:
  gcloud init --skip-diagnostics
 
Network diagnostic detects and fixes local network connection issues.
Checking network connection...done.
Reachability Check passed.
Network diagnostic passed (1/1 checks passed).
 
Choose the account you would like to use to perform operations for
this configuration:
 [1] mehdi.bada68@gmail.com
 [2] Log in with a new account
Please enter your numeric choice:  1
 
You are logged in as: [mehdi.bada68@gmail.com].
 
Pick cloud project to use:
 [1] kubernetes-infra-258110
 [2] Create a new project
Please enter numeric choice or text value (must exactly match list
item):  1
 
Your current project has been set to: [kubernetes-infra-258110].
 
Do you want to configure a default Compute Region and Zone? (Y/n)?  Y
 
Which Google Compute Engine zone would you like to use as project
default?
If you do not specify a zone via a command-line flag while working
with Compute Engine resources, the default is assumed.
 
Please enter numeric choice or text value (must exactly match list
item):  8

Login now to gcloud :

mehdi@MacBook-Pro: gcloud auth login
… 
You are now logged in as [mehdi.bada68@gmail.com].
Your current project is [kubernetes-infra-258110].  You can change this setting by running:
  $ gcloud config set project PROJECT_ID

Update your ~./kube/config file with the credentials of the new cluster created before:

mehdi@MacBook-Pro: gcloud container clusters get-credentials standard-cluster-1
Fetching cluster endpoint and auth data.
kubeconfig entry generated for standard-cluster-1.

Your kubectl client is now connected to your remote GKE cluster.

mehdi@MacBook-Pro: kubectl get nodes -o wide
NAME                                                STATUS   ROLES    AGE   VERSION          INTERNAL-IP   EXTERNAL-IP     OS-IMAGE                             KERNEL-VERSION   CONTAINER-RUNTIME
gke-standard-cluster-1-default-pool-1ac453ab-6tj4   Ready       56m   v1.13.11-gke.9   10.128.0.3    34.70.191.147   Container-Optimized OS from Google   4.14.145+        docker://18.9.7
gke-standard-cluster-1-default-pool-1ac453ab-s242   Ready       56m   v1.13.11-gke.9   10.128.0.4    35.188.3.165    Container-Optimized OS from Google   4.14.145+        docker://18.9.7
gke-standard-cluster-1-default-pool-1ac453ab-w0j0   Ready       56m   v1.13.11-gke.9   10.128.0.2    34.70.107.231   Container-Optimized OS from Google   4.14.145+        docker://18.9.7
Deploy Kubernetes Dashboard

After configuring the kubectl client we can start deploying resources on the Kubernetes cluster. One of the most popular resources in Kubernetes is the dashboard. It allows users and admin having a graphical view of all cluster resources.

Download the dashboard deployment locally:

curl -o dashboard.yaml  https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-beta4/aio/deploy/recommended.yaml

Then apply the deployment:

mehdi@MacBook-Pro: kubectl apply -f dashboard.yaml
namespace/kubernetes-dashboard created
serviceaccount/kubernetes-dashboard created
service/kubernetes-dashboard created
secret/kubernetes-dashboard-certs created
secret/kubernetes-dashboard-csrf created
secret/kubernetes-dashboard-key-holder created
configmap/kubernetes-dashboard-settings created
role.rbac.authorization.k8s.io/kubernetes-dashboard created
clusterrole.rbac.authorization.k8s.io/kubernetes-dashboard created
rolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created
clusterrolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created
deployment.apps/kubernetes-dashboard created
service/dashboard-metrics-scraper created
deployment.apps/dashboard-metrics-scraper created

Create an admin Service Account and Cluster Role Binding that you can use to securely connect to the dashboard with admin-level permissions:

mehdi@MacBook-Pro: vi admin-sa.yaml 

apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin
  namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin
  namespace: kubernetes-dashboard

mehdi@MacBook-Pro: kubectl apply -f admin-sa.yaml
serviceaccount/admin created
clusterrolebinding.rbac.authorization.k8s.io/admin created

First, retrieve the authentication token for the admin service account, as below:

mehdi@MacBook-Pro: kubectl -n kubernetes-dashboard describe secret $(kubectl -n kubernetes-dashboard get secret | grep admin | awk '{print $1}')
Name:         admin-token-dpsl9
Namespace:    kubernetes-dashboard
Labels:       
Annotations:  kubernetes.io/service-account.name: admin
              kubernetes.io/service-account.uid: 888de3dc-ffff-11e9-b5ca-42010a800046

Type:  kubernetes.io/service-account-token

Data
====
ca.crt:     1119 bytes
namespace:  20 bytes
token:      eyJhbGciOiJSUzI1NiIsImtpZCI6IiJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlcm5ldGVzLWRhc2hib2FyZCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJhZG1pbi10b2tlbi1kcHNsOSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50Lm5hbWUiOiJhZG1pbiIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6Ijg4OGRlM2RjLWZmZmYtMTFlOS1iNWNhLTQyMDEwYTgwMDA0NiIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDprdWJlcm5ldGVzLWRhc2hib2FyZDphZG1pbiJ9.DBrfylt1RFDpHEuTy4l0BY-kRwFqm9Tvfne8Vu-IZVghy87vVWtsCatjt2wzCtMjX-I5oB0YAYmio7pTwPV-Njyd_VvbWupqOF7yiYE72ZXri0liLnQN5qbtyOmswsjim0ehG_yQSHaAqp21cQdPXb59ItBLN7q0-dh8wBRyOMAVLttjbmzBb02XxtJlALYg8F4hAkyHjJAzHAyntMylUXyS2gn471WUYFs1usDDpA8uZRU3_K6oyccXa-xqs8kKRB1Ch6n4Cq9TeMKkoUyv0_alEEQvwkp_uQCl2Rddk7bLNnjfDXDPC9LXOT-2xfvUf8COe5dO-rUXemHJlhPUHw

Copy the token value.

Access to the Kubernetes dashboard using the kubectl proxy command line.

mehdi@MacBook-Pro: kubectl proxy
Starting to serve on 127.0.0.1:8001

The dashboard is now available in the following link: http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/#/login

Choose the token authentication and paste the value from the previous output.

You have now access to the Kubernetes dashboard and deployed your first Kubernetes resource!

Deploy an Ingress Load Balancer

In order to access your cluster service externally, we need to create an ingress load balancer for our GKE cluster. The ingress load balancer will make HTTP/HTTPS applications accessible publicly through the creation of an external IP address for the cluster.

Before creating the ingress, we need to deploy a test application for our example. Let’s deploy an NGINX server.

mehdi@MacBook-Pro: vi nginx-deployment.yaml

apiVersion: apps/v1beta2
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 2
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  type: NodePort
  ports:
    - port: 80
  selector:
    app: nginx


mehdi@MacBook-Pro: kubectl apply -f nginx-deployment.yaml

deployment.apps/nginx-deployment unchanged
service/nginx created

Create the ingress resource and deploy it as following:

mehdi@MacBook-Pro: vi basic-ingress.yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: basic-ingress
spec:
rules:
- http:
paths:
- backend:
serviceName: nginx
servicePort: 80

mehdi@MacBook-Pro: kubectl apply -f basic-ingress.yaml
ingress.extensions/basic-ingress created

Verify the status of the ingress:

mehdi@MacBook-Pro: kubectl get ing -o wide
NAME            HOSTS   ADDRESS         PORTS   AGE
basic-ingress   *       34.102.214.94   80      8h

The ingress resources have been properly created. We can see the result directly from the Google Cloud dashboard.

The NGINX service is now available via the Ingress Load Balancer and can be accessed through:

Cet article Create a Kubernetes cluster with Google Kubernetes Engine est apparu en premier sur Blog dbi services.

Handling PostgreSQL installations from packages

Sun, 2019-11-03 13:18
In this blog I will show how to handle a PostgreSQL installation with a customized PGDATA using the packages provided by the PostgreSQL community.

One issue with the packages is the hard coded PGDATA, which will be overwritten in the Servicefile with each update of PostgreSQL. This blog entry based on PostgreSQL 12 with CentOS 7 and CentOS 8.

On a minimal installation in my mind a few things are missing, the net-tools package and nano as editor, I’m a friend of using nano instead of vi.

CentOS 7:

$ yum install net-tools
$ yum install nano

CentOS 8:

$ dnf install net-tools
$ dnf install nano

For using the PostgreSQL repository it is important to exclude PostgreSQL from the CentOS Repository.

By using CentOS 7 you need to edit the CentOS-Base repofile to exclude PostgreSQL from Base and Updates.

$ nano /etc/yum.repos.d/CentOS-Base.repo

# CentOS-Base.repo
#
# The mirror system uses the connecting IP address of the client and the
# update status of each mirror to pick mirrors that are updated to and
# geographically close to the client.  You should use this for CentOS updates
# unless you are manually picking other mirrors.
#
# If the mirrorlist= does not work for you, as a fall back you can try the
# remarked out baseurl= line instead.
#
#

[base]
name=CentOS-$releasever - Base
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=os&infra=$infra
#baseurl=http://mirror.centos.org/centos/$releasever/os/$basearch/
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7
#exclude PostgreSQL from os repository 
exclude=postgresql* 

#released updates
[updates]
name=CentOS-$releasever - Updates
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=updates&infra=$inf$
#baseurl=http://mirror.centos.org/centos/$releasever/updates/$basearch/
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7
#exclude PostgreSQL from os repository 
exclude=postgresql*

#additional packages that may be useful
[extras]
name=CentOS-$releasever - Extras
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=extras&infra=$infra
#baseurl=http://mirror.centos.org/centos/$releasever/extras/$basearch/
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7

#additional packages that extend functionality of existing packages
[centosplus]
name=CentOS-$releasever - Plus
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=centosplus&infra=$$
[ Read 46 lines ]

By using CentOS 8 it is just one command to exclude PostgreSQL from the distribution repository:

$ dnf -y module disable postgresql

Add PostgreSQL Repository to CentOS 7, in this example it is ProstgreSQL 12

$ yum install https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm

And the same for CentOS 8

$ dnf install https://download.postgresql.org/pub/repos/yum/reporpms/EL-8-x86_64/pgdg-redhat-repo-latest.noarch.rpm

Now it is time to install PostgreSQL 12 out of the PostgreSQL repository BUT NO INITDB at the moment.

CentOS 7:

$ yum install postgresql12 postgresql12-server postgresql12-contrib

CentOS 8:

$ dnf install postgresql12 postgresql12-server postgresql12-contrib

Now it is time to create the override file to the PostgreSQL Service file, the steps are identical on CentOS 7 and CentOS 8.

In my example PGDATA is in /pg_data/12/data mounted as own volume.

So edit the postgresql-12.service file with sysctl edit:

$ systemctl edit postgresql-12.service

And add the needed content for your customized PGDATA:

[Service]
Environment=PGDATA=/pg_data/12/data

Save the change, it will create a /etc/systemd/system/postgresql-12.service.d/override.conf file which will be merged with the original service file.

To check the content:

$ cat /etc/systemd/system/postgresql-12.service.d/override.conf
[Service]
Environment=PGDATA=/pg_data/12/data

Reload Systemd

$ systemctl daemon-reload

Hopefully your PGATA is owned by the postgres user if not make sure that it is:

$ chown -R postgres:postgres /pg_data/

Create the PostgreSQL instance as root user:

$ /usr/pgsql-12/bin/postgresql-12-setup initdb
Initializing database ... OK

Here it is:

[root@centos-8-blog /]# cd /pg_data/12/data/
[root@centos-8-blog data]# ls
base          pg_dynshmem    pg_multixact  pg_snapshots  pg_tblspc    pg_xact
global        pg_hba.conf    pg_notify     pg_stat       pg_twophase  postgresql.auto.conf
log           pg_ident.conf  pg_replslot   pg_stat_tmp   PG_VERSION   postgresql.conf
pg_commit_ts  pg_logical     pg_serial     pg_subtrans   pg_wal

From now on PostgreSQL minor updates will be done with yum update on CentOS 7 or dnf update on CentOS 8 in one step, no extra downtime for it.

But be careful, before running yum update or dnf update STOP ALL POSTGRESQL INSTANCES!

This is also working in environments with many instances, you need a service file and an override.conf for each instance, an additional instance needs to be created with initdb -D and not with PostgreSQL-12-setup initdb.

This method is also working with SLES 12.

 

Cet article Handling PostgreSQL installations from packages est apparu en premier sur Blog dbi services.

pg_auto_failover: Setup and installation

Fri, 2019-11-01 02:25

When I attended PGIBZ 2019 earlier this year, I talked with Dimitri about pg_auto_failover and I promised to have a look at it. Well, almost half a year later and after we’ve met again at pgconf.eu it is time to actually do that. You probably already know that citudata was acquired by Microsoft earlier this year and that Microsoft seems to be committed to open source since a few years. pg_auto_failover is one of the projects they contribute back to the PostgreSQL community. This will be a multi-blog series and in this very first post it is all about getting it up and running. In a following post we will then look at failover and switchover scenarios.

As usual, when you need auto failover you need at least three nodes and pg_auto_failover is no exception to that. The following graphic is stolen from the pg_auto_failover github page:

We have one PostgreSQL master, one PostgreSQL replica and in addition a monitoring host. In may case that maps to:

pg-af1.ti.dbi-services.com master 192.168.22.70 pg-af2.ti.dbi-services.com replica 192.168.22.71 pg-af3.ti.dbi-services.com monitor/cluster management 192.168.22.72

All of these nodes run CentOS 8 and I will be going from source code as that gives most flexibility. As pg_auto_failover depends on PostgreSQL (of course) the first step is to install PostgreSQL on all three nodes (PostgreSQL 12 in this setup). If you need further information on how to do that you can e.g. check here. Basically these steps have been executed on all the three nodes (given that the postgres user already exists and sudo is configured):

[postgres@pg-af1 ~]$ sudo dnf install -y gcc openldap-devel python36-devel readline-devel redhat-lsb bison flex perl-ExtUtils-Embed zlib-devel openssl-devel pam-devel libxml2-devel libxslt-devel openssh-clients bzip2 net-tools wget unzip sysstat xorg-x11-xauth systemd-devel bash-completion python36 policycoreutils-python-utils make git
[postgres@pg-af1 ~]$ wget https://ftp.postgresql.org/pub/source/v12.0/postgresql-12.0.tar.bz2
[postgres@pg-af1 ~]$ tar -axf postgresql-12.0.tar.bz2
[postgres@pg-af1 ~]$ cd postgresql-12.0
[postgres@pg-af1 postgresql-12.0]$ sudo mkdir -p /u01 /u02
[postgres@pg-af1 postgresql-12.0]$ sudo chown postgres:postgres /u01 /u02
[postgres@pg-af1 postgresql-12.0]$ PGHOME=/u01/app/postgres/product/12/db_0/
[postgres@pg-af1 postgresql-12.0]$ SEGSIZE=2
[postgres@pg-af1 postgresql-12.0]$ BLOCKSIZE=8
[postgres@pg-af1 postgresql-12.0]$ WALSEGSIZE=64
[postgres@pg-af1 postgresql-12.0]$ ./configure --prefix=${PGHOME} \
> --exec-prefix=${PGHOME} \
> --bindir=${PGHOME}/bin \
> --libdir=${PGHOME}/lib \
> --sysconfdir=${PGHOME}/etc \
> --includedir=${PGHOME}/include \
> --datarootdir=${PGHOME}/share \
> --datadir=${PGHOME}/share \
> --with-pgport=5432 \
> --with-perl \
> --with-python \
> --with-openssl \
> --with-pam \
> --with-ldap \
> --with-libxml \
> --with-libxslt \
> --with-segsize=${SEGSIZE} \
> --with-blocksize=${BLOCKSIZE} \
> --with-systemd \
> --with-extra-version=" dbi services build"
[postgres@pg-af1 postgresql-12.0]$ make all
[postgres@pg-af1 postgresql-12.0]$ make install
[postgres@pg-af1 postgresql-12.0]$ cd contrib
[postgres@pg-af1 contrib]$ make install
[postgres@pg-af1 contrib]$ cd ../..
[postgres@pg-af1 ~]$ rm -rf postgresql*

We will go for an installation from source code of pg_auto_failover as well (again, on all three nodes):

postgres@pg-af1:/home/postgres/ [pg120] git clone https://github.com/citusdata/pg_auto_failover.git
postgres@pg-af1:/home/postgres/ [pg120] cd pg_auto_failover/
postgres@pg-af1:/home/postgres/pg_auto_failover/ [pg120] make
postgres@pg-af1:/home/postgres/pg_auto_failover/ [pg120] make install
postgres@pg-af1:/home/postgres/pg_auto_failover/ [pg120] cd ..
postgres@pg-af1:/home/postgres/ [pg120] rm -rf pg_auto_failover/

That’s it, quite easy. What I like especially is, that there are no dependencies on python or any other libraries except for PostgreSQL. What the installation gives us is basically pg_autoctl:

postgres@pg-af1:/home/postgres/ [pg120] pg_autoctl --help
pg_autoctl: pg_auto_failover control tools and service
usage: pg_autoctl [ --verbose --quiet ]


Available commands:
pg_autoctl
+ create   Create a pg_auto_failover node, or formation
+ drop     Drop a pg_auto_failover node, or formation
+ config   Manages the pg_autoctl configuration
+ show     Show pg_auto_failover information
+ enable   Enable a feature on a formation
+ disable  Disable a feature on a formation
run      Run the pg_autoctl service (monitor or keeper)
stop     signal the pg_autoctl service for it to stop
reload   signal the pg_autoctl for it to reload its configuration
help     print help message
version  print pg_autoctl version

The first step in setting up the cluster is to initialize the monitoring node:

postgres@pg-af3:/home/postgres/ [pg120] pg_autoctl create --help
pg_autoctl create: Create a pg_auto_failover node, or formation

Available commands:
pg_autoctl create
monitor    Initialize a pg_auto_failover monitor node
postgres   Initialize a pg_auto_failover standalone postgres node
formation  Create a new formation on the pg_auto_failover monitor

postgres@pg-af3:/home/postgres/ [pg120] sudo mkdir -p /u02/pgdata
postgres@pg-af3:/home/postgres/ [pg120] sudo chown postgres:postgres /u02/pgdata
postgres@pg-af3:/home/postgres/ [pg120] unset PGDATABASE
postgres@pg-af3:/home/postgres/ [] pg_autoctl create monitor --pgdata /u02/pgdata/PG12/af
INFO  Initialising a PostgreSQL cluster at "/u02/pgdata/PG12/af"
INFO   /u01/app/postgres/product/12/db_0/bin/pg_ctl --pgdata /u02/pgdata/PG12/af --options "-p 5432" --options "-h *" --waitstart
INFO  Granting connection privileges on 192.168.22.0/24
INFO  Your pg_auto_failover monitor instance is now ready on port 5432.
INFO  pg_auto_failover monitor is ready at postgres://autoctl_node@pg-af3:5432/pg_auto_failover
INFO  Monitor has been succesfully initialized.

Once that succeeds you’ll a new PostgreSQL instance running and pg_auto_failover PostgreSQL background worker processes:

postgres@pg-af3:/home/postgres/ [af] ps -ef | grep "postgres:"
postgres  5958  5955  0 14:15 ?        00:00:00 postgres: checkpointer
postgres  5959  5955  0 14:15 ?        00:00:00 postgres: background writer
postgres  5960  5955  0 14:15 ?        00:00:00 postgres: walwriter
postgres  5961  5955  0 14:15 ?        00:00:00 postgres: autovacuum launcher
postgres  5962  5955  0 14:15 ?        00:00:00 postgres: stats collector
postgres  5963  5955  0 14:15 ?        00:00:00 postgres: pg_auto_failover monitor
postgres  5964  5955  0 14:15 ?        00:00:00 postgres: logical replication launcher
postgres  5965  5955  0 14:15 ?        00:00:00 postgres: pg_auto_failover monitor worker
postgres  5966  5955  0 14:15 ?        00:00:00 postgres: pg_auto_failover monitor worker

The initialization of the monitor node also created a new database and two roles:

postgres@pg-af3:/home/postgres/ [af] psql postgres
psql (12.0 dbi services build)
Type "help" for help.

postgres=# \l
List of databases
Name       |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges
------------------+----------+----------+-------------+-------------+-----------------------
pg_auto_failover | autoctl  | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
postgres         | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
template0        | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
                 |          |          |             |             | postgres=CTc/postgres
template1        | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
                 |          |          |             |             | postgres=CTc/postgres

postgres=# \du
List of roles
Role name   |                         Attributes                         | Member of
--------------+------------------------------------------------------------+-----------
autoctl      |                                                            | {}
autoctl_node |                                                            | {}
postgres     | Superuser, Create role, Create DB, Replication, Bypass RLS | {}

What we got in the new database is the pgautofailover extension:

pg_auto_failover=# \dx
List of installed extensions
Name      | Version |   Schema   |         Description
----------------+---------+------------+------------------------------
pgautofailover | 1.0     | public     | pg_auto_failover
plpgsql        | 1.0     | pg_catalog | PL/pgSQL procedural language
(2 rows)

For our management kit to work properly a few PostgreSQL parameters will be set:

pg_auto_failover=# alter system set log_truncate_on_rotation = 'on';
ALTER SYSTEM
pg_auto_failover=# alter system set log_filename = 'postgresql-%a.log';
ALTER SYSTEM
pg_auto_failover=# alter system set log_rotation_age = '1440';
ALTER SYSTEM
pg_auto_failover=# alter system set log_line_prefix = '%m - %l - %p - %h - %u@%d - %x';
ALTER SYSTEM
pg_auto_failover=# alter system set log_directory = 'pg_log';
ALTER SYSTEM
pg_auto_failover=# alter system set log_min_messages = 'WARNING';
ALTER SYSTEM
pg_auto_failover=# alter system set log_autovacuum_min_duration = '60s';
ALTER SYSTEM
pg_auto_failover=# alter system set log_min_error_statement = 'NOTICE';
ALTER SYSTEM
pg_auto_failover=# alter system set log_min_duration_statement = '30s';
ALTER SYSTEM
pg_auto_failover=# alter system set log_checkpoints = 'on';
ALTER SYSTEM
pg_auto_failover=# alter system set log_statement = 'ddl';
ALTER SYSTEM
pg_auto_failover=# alter system set log_lock_waits = 'on';
ALTER SYSTEM
pg_auto_failover=# alter system set log_temp_files = '0';
ALTER SYSTEM
pg_auto_failover=# alter system set log_timezone = 'Europe/Zurich';
ALTER SYSTEM
pg_auto_failover=# alter system set log_connections=on;
ALTER SYSTEM
pg_auto_failover=# alter system set log_disconnections=on;
ALTER SYSTEM
pg_auto_failover=# alter system set log_duration=on;
ALTER SYSTEM
pg_auto_failover=# select pg_reload_conf();
pg_reload_conf
----------------
t
(1 row)

What we need for the other nodes is the connection string to the monitoring node:

postgres@pg-af3:/home/postgres/ [af] pg_autoctl show uri
postgres://autoctl_node@pg-af3:5432/pg_auto_failover

Once we have that we can proceed with creating the master instance on the first host:

postgres@pg-af1:/home/postgres/ [pg120] unset PGDATABASE
postgres@pg-af1:/home/postgres/ [] sudo mkdir /u02/pgdata
postgres@pg-af1:/home/postgres/ [] sudo chown postgres:postgres /u02/pgdata
postgres@pg-af1:/home/postgres/ [] pg_autoctl create postgres --pgdata /u02/pgdata/12/PG1 --nodename pg-af1.it.dbi-services.com --monitor postgres://autoctl_node@pg-af3:5432/pg_auto_failover
INFO  Found pg_ctl for PostgreSQL 12.0 at /u01/app/postgres/product/12/db_0/bin/pg_ctl
INFO  Registered node pg-af1.it.dbi-services.com:5432 with id 1 in formation "default", group 0.
INFO  Writing keeper init state file at "/home/postgres/.local/share/pg_autoctl/u02/pgdata/12/PG1/pg_autoctl.init"
INFO  Successfully registered as "single" to the monitor.
INFO  Initialising a PostgreSQL cluster at "/u02/pgdata/12/PG1"
INFO  Postgres is not running, starting postgres
INFO   /u01/app/postgres/product/12/db_0/bin/pg_ctl --pgdata /u02/pgdata/12/PG1 --options "-p 5432" --options "-h *" --wait start
INFO  The user "postgres" already exists, skipping.
INFO  CREATE DATABASE postgres;
INFO  The database "postgres" already exists, skipping.
INFO  FSM transition from "init" to "single": Start as a single node
INFO  Initialising postgres as a primary
INFO  Transition complete: current state is now "single"
INFO  Keeper has been succesfully initialized.

Once the master if up bring up the replica on the second node:

postgres@pg-af2:/home/postgres/ [pg120] pg_autoctl create postgres --pgdata /u02/pgdata/12/PG1 --nodename pg-af2.it.dbi-services.com --monitor postgres://autoctl_node@pg-af3:5432/pg_auto_failover
17:11:42 INFO  Registered node pg-af2.it.dbi-services.com:5432 with id 2 in formation "default", group 0.
17:11:42 INFO  Writing keeper init state file at "/home/postgres/.local/share/pg_autoctl/u02/pgdata/12/PG1/pg_autoctl.init"
17:11:42 INFO  Successfully registered as "wait_standby" to the monitor.
17:11:42 INFO  FSM transition from "init" to "wait_standby": Start following a primary
17:11:42 INFO  Transition complete: current state is now "wait_standby"
17:11:47 INFO  FSM transition from "wait_standby" to "catchingup": The primary is now ready to accept a standby
17:11:47 INFO  The primary node returned by the monitor is pg-af1.it.dbi-services.com:5432
17:11:47 INFO  Initialising PostgreSQL as a hot standby
17:11:47 INFO  Running /u01/app/postgres/product/12/db_0/bin/pg_basebackup -w -h pg-af1.it.dbi-services.com -p 5432 --pgdata /u02/pgdata/12/backup -U pgautofailover_replicator --write-recovery-conf --max-rate 100M --wal-method=stream --slot pgautofailover_standby ...
17:11:49 INFO  pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/2000028 on timeline 1
pg_basebackup: starting background WAL receiver
0/23699 kB (0%), 0/1 tablespace (/u02/pgdata/12/backup/backup_label )
136/23699 kB (0%), 0/1 tablespace (/u02/pgdata/12/backup/global/4184  )
23708/23708 kB (100%), 0/1 tablespace (...data/12/backup/global/pg_control)
23708/23708 kB (100%), 1/1 tablespace
pg_basebackup: write-ahead log end point: 0/2000100
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: syncing data to disk ...
pg_basebackup: base backup completed

17:11:49 INFO  Postgres is not running, starting postgres
17:11:49 INFO   /u01/app/postgres/product/12/db_0/bin/pg_ctl --pgdata /u02/pgdata/12/PG1 --options "-p 5432" --options "-h *" --wait start
17:11:50 INFO  PostgreSQL started on port 5432
17:11:50 INFO  Transition complete: current state is now "catchingup"
17:11:50 INFO  Keeper has been succesfully initialized.

The next step is to start the so called keeper process (this is the process which communicates with the montoring node about state changes):

postgres@pg-af1:/home/postgres/ [] pg_autoctl run --pgdata /u02/pgdata/12/PG1
INFO  Managing PostgreSQL installation at "/u02/pgdata/12/PG1"
INFO  The version of extenstion "pgautofailover" is "1.0" on the monitor
INFO  pg_autoctl service is starting
INFO  Calling node_active for node default/1/0 with current state: single, PostgreSQL is running, sync_state is "", current lsn is "0/0".
INFO  Calling node_active for node default/1/0 with current state: single, PostgreSQL is running, sync_state is "", current lsn is "0/0".
INFO  Calling node_active for node default/1/0 with current state: single, PostgreSQL is running, sync_state is "", current lsn is "0/0".
INFO  Calling node_active for node default/1/0 with current state: single, PostgreSQL is running, sync_state is "", current lsn is "0/0".

To integrate that into systemd:

postgres@pg-af2:/home/postgres/ [PG1] pg_autoctl show systemd
20:28:43 INFO  HINT: to complete a systemd integration, run the following commands:
20:28:43 INFO  pg_autoctl -q show systemd --pgdata "/u02/pgdata/12/PG1" | sudo tee /etc/systemd/system/pgautofailover.service
20:28:43 INFO  sudo systemctl daemon-reload
20:28:43 INFO  sudo systemctl start pgautofailover
[Unit]
Description = pg_auto_failover

[Service]
WorkingDirectory = /u02/pgdata/12/PG1
Environment = 'PGDATA=/u02/pgdata/12/PG1'
User = postgres
ExecStart = /u01/app/postgres/product/12/db_0/bin/pg_autoctl run
Restart = always
StartLimitBurst = 0

[Install]
WantedBy = multi-user.target

postgres@pg-af2:/home/postgres/ [PG1] pg_autoctl -q show systemd --pgdata "/u02/pgdata/12/PG1" | sudo tee /etc/systemd/system/pgautofailover.service
[Unit]
Description = pg_auto_failover

[Service]
WorkingDirectory = /u02/pgdata/12/PG1
Environment = 'PGDATA=/u02/pgdata/12/PG1'
User = postgres
ExecStart = /u01/app/postgres/product/12/db_0/bin/pg_autoctl run
Restart = always
StartLimitBurst = 0

[Install]
WantedBy = multi-user.target

postgres@pg-af2:/home/postgres/ [PG1] systemctl list-unit-files | grep pgauto
pgautofailover.service                      disabled
20:30:57 postgres@pg-af2:/home/postgres/ [PG1] sudo systemctl enable pgautofailover.service
Created symlink /etc/systemd/system/multi-user.target.wants/pgautofailover.service → /etc/systemd/system/pgautofailover.service.

If you are on CentOS/Red Hat 8 you will also need this as otherwise the service will not start:

postgres@pg-af1:/u01/app/postgres/local/dmk/ [PG1] sudo semanage fcontext -a -t bin_t /u01/app/postgres/product/12/db_0/bin/pg_autoctl
postgres@pg-af1:/u01/app/postgres/local/dmk/ [PG1] restorecon -v /u01/app/postgres/product/12/db_0/bin/pg_autoctl

After rebooting all the nodes (to confirm that the systemd service is working as expected) the state of the cluster reports one primary and a secondary/replica as expected:

postgres@pg-af3:/home/postgres/ [af] pg_autoctl show state
Name |   Port | Group |  Node |     Current State |    Assigned State
---------------------------+--------+-------+-------+-------------------+------------------
pg-af1.it.dbi-services.com |   5432 |     0 |     1 |           primary |           primary
pg-af2.it.dbi-services.com |   5432 |     0 |     2 |         secondary |         secondary

The various states are documented here.

Remember: As this is based on PostgreSQL 12 there will be no recovery.conf on the replica. The replication parameters have been added to postgresql.auto.conf automatically:

postgres@pg-af2:/u02/pgdata/12/PG1/ [PG1] cat postgresql.auto.conf
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
primary_conninfo = 'user=pgautofailover_replicator passfile=''/home/postgres/.pgpass'' connect_timeout=5 host=''pg-af1.it.dbi-services.com'' port=5432 sslmode=prefer sslcompression=0 gssencmode=disable target_session_attrs=any'
primary_slot_name = 'pgautofailover_standby'

That’s it for the setup. Really easy and simple, I like it. In the next post we’ll have a look at controlled switch-overs and fail-over scenarios.

Cet article pg_auto_failover: Setup and installation est apparu en premier sur Blog dbi services.

AEM Forms – Certify PDF end-up with NoSuchMethodError on bouncycastle

Thu, 2019-10-31 13:30

As part of an AEM project, we were working on setting up a few actions on PDF files. One of these actions was to Sign & Certify a PDF file. The basic Sign & Certify action provided by AEM is working easily by default but if you look deeper, you might get some surprise. The complexity in this case came from the fact that we absolutely needed the signature to contain a valid Time-Stamp using the Time-Stamp Protocol (TSP) as well as a valid Long-Term Validation (LTV). In this blog, I will talk about one (of the numerous) issue we faced that I believe is related only to AEM on WebLogic.

As I mentioned above, the basic Certify operation is working easily but if you do not take a closer look, it might not be TSP and/or LTV. In our case, using AEM 6.4 SP3 on WebLogic Server 12.2.1.3, we got the Certify operation to work but without TSP & LTV:

Certify PDF - TSP failed & LTV failed

Looking at the AEM Managed Server logs, you can see that the last line is an error message:

####<Aug 28, 2019 12:15:22,278 PM UTC> <Info> <com.adobe.livecycle.usermanager.sslauthprovider.SSLMutualAuthProvider> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '16' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-129013562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000055> <1566994522278> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Got Subject DN as CN=aem-dev,OU=IT,O=dbi services,L=Delemont,ST=Jura,C=CH>
####<Aug 28, 2019 12:15:25,025 PM UTC> <Info> <com.adobe.livecycle.usermanager.sslauthprovider.SSLMutualAuthProvider> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-129513562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994525025> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Got Subject DN as CN=aem-dev,OU=IT,O=dbi services,L=Delemont,ST=Jura,C=CH>
####<Aug 28, 2019 12:15:25,680 PM UTC> <Info> <com.adobe.formServer.config.FormServerConfigImpl> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-12C213562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994525680> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <FSC008: Using the database to access and persist configuration properties.>
####<Aug 28, 2019 12:15:25,681 PM UTC> <Info> <com.adobe.formServer.config.FormServerConfigImpl> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-12C213562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994525681> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <FSC001: The property LastCacheResetTime has been changed from  to 1555070921173>
####<Aug 28, 2019 12:15:25,681 PM UTC> <Info> <com.adobe.formServer.config.FormServerConfigImpl> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-12C213562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994525681> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <FSC001: The property CacheValidationTime has been changed from 0 to 1555070921058>
####<Aug 28, 2019 12:15:25,684 PM UTC> <Info> <com.adobe.formServer.common.cachemanager.CacheConfig> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-12C213562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994525684> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Initializing cache from default values >
####<Aug 28, 2019 12:15:26,130 PM UTC> <Info> <Common> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-130E13562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994526130> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000628> <Created "1" resources for pool "IDP_DS", out of which "1" are available and "0" are unavailable.>
####<Aug 28, 2019 12:15:26,141 PM UTC> <Info> <com.adobe.formServer.common.cachemanager.CacheConfig> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-12C213562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994526141> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Initializing cache from default values >
####<Aug 28, 2019 12:15:26,147 PM UTC> <Info> <com.adobe.formServer.common.cachemanager.CacheConfig> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-12C213562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994526147> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Initializing cache from default values >
####<Aug 28, 2019 12:15:26,153 PM UTC> <Info> <com.adobe.formServer.common.cachemanager.CacheConfig> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-12C213562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994526153> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Initializing cache from default values >
####<Aug 28, 2019 12:15:26,158 PM UTC> <Info> <com.adobe.formServer.common.cachemanager.CacheConfig> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-12C213562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994526158> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Initializing cache from default values >
####<Aug 28, 2019 12:15:26,571 PM UTC> <Info> <com.adobe.formServer.config.FormServerConfigImpl> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-12C213562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994526571> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <FSC008: Using the database to access and persist configuration properties.>
####<Aug 28, 2019 12:15:27,835 PM UTC> <Info> <com.adobe.livecycle.usermanager.sslauthprovider.SSLMutualAuthProvider> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '60' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-13A613562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000057> <1566994527835> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Got Subject DN as CN=aem-dev,OU=IT,O=dbi services,L=Delemont,ST=Jura,C=CH>
####<Aug 28, 2019 12:15:30,923 PM UTC> <Error> <com.adobe.workflow.AWS> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-12C213562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994530923> <[severity-value: 8] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <An exception was thrown with name java.lang.NoSuchMethodError message:org.bouncycastle.asn1.x509.AlgorithmIdentifier.getObjectId()Lorg/bouncycastle/asn1/ASN1ObjectIdentifier; while invoking service SignatureService and operation certify and no fault routes were found to be configured.>

 

At the same time, we also got this kind of messages:

ALC-DSC-003-000: com.adobe.idp.dsc.DSCInvocationException: Invocation error.
            at com.adobe.idp.dsc.component.impl.DefaultPOJOInvokerImpl.invoke(DefaultPOJOInvokerImpl.java:152)
            at com.adobe.idp.dsc.interceptor.impl.InvocationInterceptor.intercept(InvocationInterceptor.java:140)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.interceptor.impl.DocumentPassivationInterceptor.intercept(DocumentPassivationInterceptor.java:53)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.transaction.interceptor.TransactionInterceptor$1.doInTransaction(TransactionInterceptor.java:74)
            at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapterBean.execute(EjbTransactionCMTAdapterBean.java:357)
            at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapterBean.doRequired(EjbTransactionCMTAdapterBean.java:274)
            at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapter_yjcxi4_ELOImpl.__WL_invoke(Unknown Source)
            at weblogic.ejb.container.internal.SessionLocalMethodInvoker.invoke(SessionLocalMethodInvoker.java:33)
            at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapter_yjcxi4_ELOImpl.doRequired(Unknown Source)
            at com.adobe.idp.dsc.transaction.impl.ejb.EjbTransactionProvider.execute(EjbTransactionProvider.java:129)
            at com.adobe.idp.dsc.transaction.interceptor.TransactionInterceptor.intercept(TransactionInterceptor.java:72)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.interceptor.impl.InvocationStrategyInterceptor.intercept(InvocationStrategyInterceptor.java:55)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.interceptor.impl.InvalidStateInterceptor.intercept(InvalidStateInterceptor.java:37)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.interceptor.impl.AuthorizationInterceptor.intercept(AuthorizationInterceptor.java:188)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.interceptor.impl.JMXInterceptor.intercept(JMXInterceptor.java:48)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.engine.impl.ServiceEngineImpl.invoke(ServiceEngineImpl.java:121)
            at com.adobe.idp.dsc.routing.Router.routeRequest(Router.java:131)
            at com.adobe.idp.dsc.provider.impl.base.AbstractMessageReceiver.routeMessage(AbstractMessageReceiver.java:93)
            at com.adobe.idp.dsc.provider.impl.vm.VMMessageDispatcher.doSend(VMMessageDispatcher.java:225)
            at com.adobe.idp.dsc.provider.impl.base.AbstractMessageDispatcher.send(AbstractMessageDispatcher.java:69)
            at com.adobe.idp.dsc.clientsdk.ServiceClient.invoke(ServiceClient.java:215)
            at com.adobe.workflow.engine.PEUtil.invokeAction(PEUtil.java:893)
            at com.adobe.idp.workflow.dsc.invoker.WorkflowDSCInvoker.transientInvoke(WorkflowDSCInvoker.java:356)
            at com.adobe.idp.workflow.dsc.invoker.WorkflowDSCInvoker.invoke(WorkflowDSCInvoker.java:159)
            at com.adobe.idp.dsc.interceptor.impl.InvocationInterceptor.intercept(InvocationInterceptor.java:140)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.interceptor.impl.DocumentPassivationInterceptor.intercept(DocumentPassivationInterceptor.java:53)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.transaction.interceptor.TransactionInterceptor$1.doInTransaction(TransactionInterceptor.java:74)
            at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapterBean.execute(EjbTransactionCMTAdapterBean.java:357)
            at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapterBean.doRequiresNew(EjbTransactionCMTAdapterBean.java:299)
            at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapter_yjcxi4_ELOImpl.__WL_invoke(Unknown Source)
            at weblogic.ejb.container.internal.SessionLocalMethodInvoker.invoke(SessionLocalMethodInvoker.java:33)
            at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapter_yjcxi4_ELOImpl.doRequiresNew(Unknown Source)
            at com.adobe.idp.dsc.transaction.impl.ejb.EjbTransactionProvider.execute(EjbTransactionProvider.java:143)
            at com.adobe.idp.dsc.transaction.interceptor.TransactionInterceptor.intercept(TransactionInterceptor.java:72)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.interceptor.impl.InvocationStrategyInterceptor.intercept(InvocationStrategyInterceptor.java:55)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.interceptor.impl.InvalidStateInterceptor.intercept(InvalidStateInterceptor.java:37)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.interceptor.impl.AuthorizationInterceptor.intercept(AuthorizationInterceptor.java:188)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.interceptor.impl.JMXInterceptor.intercept(JMXInterceptor.java:48)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.engine.impl.ServiceEngineImpl.invoke(ServiceEngineImpl.java:121)
            at com.adobe.idp.dsc.routing.Router.routeRequest(Router.java:131)
            at com.adobe.idp.dsc.provider.impl.base.AbstractMessageReceiver.invoke(AbstractMessageReceiver.java:329)
            at com.adobe.idp.dsc.provider.impl.soap.axis.sdk.SoapSdkEndpoint.invokeCall(SoapSdkEndpoint.java:153)
            at com.adobe.idp.dsc.provider.impl.soap.axis.sdk.SoapSdkEndpoint.invoke(SoapSdkEndpoint.java:91)
            at sun.reflect.GeneratedMethodAccessor621.invoke(Unknown Source)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:498)
            at org.apache.axis.providers.java.RPCProvider.invokeMethod(RPCProvider.java:397)
            at org.apache.axis.providers.java.RPCProvider.processMessage(RPCProvider.java:186)
            at org.apache.axis.providers.java.JavaProvider.invoke(JavaProvider.java:323)
            at org.apache.axis.strategies.InvocationStrategy.visit(InvocationStrategy.java:32)
            at org.apache.axis.SimpleChain.doVisiting(SimpleChain.java:118)
            at org.apache.axis.SimpleChain.invoke(SimpleChain.java:83)
            at org.apache.axis.handlers.soap.SOAPService.invoke(SOAPService.java:454)
            at org.apache.axis.server.AxisServer.invoke(AxisServer.java:281)
            at org.apache.axis.transport.http.AxisServlet.doPost(AxisServlet.java:699)
            at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
            at org.apache.axis.transport.http.AxisServletBase.service(AxisServletBase.java:327)
            at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
            at weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:286)
            at weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:260)
            at weblogic.servlet.internal.StubSecurityHelper.invokeServlet(StubSecurityHelper.java:137)
            at weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:350)
            at weblogic.servlet.internal.TailFilter.doFilter(TailFilter.java:25)
            at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:78)
            at com.adobe.idp.dsc.provider.impl.soap.axis.InvocationFilter.doFilter(InvocationFilter.java:43)
            at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:78)
            at com.adobe.idp.um.auth.filter.ParameterFilter.doFilter(ParameterFilter.java:105)
            at com.adobe.idp.um.auth.filter.CSRFFilter.invokeNextFilter(CSRFFilter.java:141)
            at com.adobe.idp.um.auth.filter.CSRFFilter.doFilter(CSRFFilter.java:132)
            at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:78)
            at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.wrapRun(WebAppServletContext.java:3706)
            at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run(WebAppServletContext.java:3672)
            at weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:328)
            at weblogic.security.service.SecurityManager.runAsForUserCode(SecurityManager.java:197)
            at weblogic.servlet.provider.WlsSecurityProvider.runAsForUserCode(WlsSecurityProvider.java:203)
            at weblogic.servlet.provider.WlsSubjectHandle.run(WlsSubjectHandle.java:71)
            at weblogic.servlet.internal.WebAppServletContext.doSecuredExecute(WebAppServletContext.java:2443)
            at weblogic.servlet.internal.WebAppServletContext.securedExecute(WebAppServletContext.java:2291)
            at weblogic.servlet.internal.WebAppServletContext.execute(WebAppServletContext.java:2269)
            at weblogic.servlet.internal.ServletRequestImpl.runInternal(ServletRequestImpl.java:1705)
            at weblogic.servlet.internal.ServletRequestImpl.run(ServletRequestImpl.java:1665)
            at weblogic.servlet.provider.ContainerSupportProviderImpl$WlsRequestExecutor.run(ContainerSupportProviderImpl.java:272)
            at weblogic.invocation.ComponentInvocationContextManager._runAs(ComponentInvocationContextManager.java:352)
            at weblogic.invocation.ComponentInvocationContextManager.runAs(ComponentInvocationContextManager.java:337)
            at weblogic.work.LivePartitionUtility.doRunWorkUnderContext(LivePartitionUtility.java:57)
            at weblogic.work.PartitionUtility.runWorkUnderContext(PartitionUtility.java:41)
            at weblogic.work.SelfTuningWorkManagerImpl.runWorkUnderContext(SelfTuningWorkManagerImpl.java:652)
            at weblogic.work.ExecuteThread.execute(ExecuteThread.java:420)
            at weblogic.work.ExecuteThread.run(ExecuteThread.java:360)
Caused by: java.lang.NoSuchMethodError: org.bouncycastle.asn1.x509.AlgorithmIdentifier.getObjectId()Lorg/bouncycastle/asn1/ASN1ObjectIdentifier;
            at com.adobe.livecycle.signatures.pki.timestamp.TimestampInfoBC.matchesMessageImprint(TimestampInfoBC.java:187)
            at com.adobe.livecycle.signatures.pki.timestamp.TimestampToken.validateRequest(TimestampToken.java:430)
            at com.adobe.livecycle.signatures.pki.impl.PKIOperations.createTimestamp(PKIOperations.java:562)
            at com.adobe.livecycle.signatures.service.impl.TimeStampProviderImpl.getTimestampToken(TimeStampProviderImpl.java:85)
            at com.adobe.idp.cryptoprovider.LCPKCS7Signer$1.getActualAttributes(LCPKCS7Signer.java:256)
            at com.adobe.livecycle.signatures.pki.signature.CMSPKCS7Impl.sign(CMSPKCS7Impl.java:702)
            at com.adobe.livecycle.signatures.pki.impl.PKIOperations.sign(PKIOperations.java:345)
            at com.adobe.livecycle.signatures.service.cryptoprovider.DSSPKCS7Signer.signData(DSSPKCS7Signer.java:84)
            at com.adobe.idp.cryptoprovider.LCPKCS7Signer.sign(LCPKCS7Signer.java:123)
            at com.adobe.internal.pdftoolkit.services.digsig.digsigframework.impl.SignatureHandlerPPKLite.writeSignatureAfterSave(SignatureHandlerPPKLite.java:816)
            at com.adobe.internal.pdftoolkit.services.digsig.impl.SigningUtils.doSigning(SigningUtils.java:820)
            at com.adobe.internal.pdftoolkit.services.digsig.SignatureManager.certifyWrapperAPI(SignatureManager.java:1554)
            at com.adobe.internal.pdftoolkit.services.digsig.SignatureManager.certify(SignatureManager.java:1542)
            at com.adobe.livecycle.signatures.service.impl.SignCertifyImpl.certify(SignCertifyImpl.java:894)
            at com.adobe.livecycle.signatures.service.impl.DocumentSecurityService.certify(DocumentSecurityService.java:1644)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:498)
            at com.adobe.idp.dsc.component.impl.DefaultPOJOInvokerImpl.invoke(DefaultPOJOInvokerImpl.java:118)
            ... 102 more

 

Based on the above messages, it is clear that there is a problem with some of the bouncycastle classes. This kind of thing is usually a missing class (“ClassNotFoundException“) or a conflict between two or more versions that are loaded by WebLogic (“NoSuchMethodError“) with the loaded/active version not containing the specific java method that is being called. We opened a SR with the Adobe Support (#188938) because this kind of thing shouldn’t be happening but after a few days without any meaningful update from them, I decided to look into the product myself to stop losing time on such trivial thing.

So this specific class (“org.bouncycastle.asn1.x509.AlgorithmIdentifier“) can be found in numerous jar files: apacheds*.jar, bcprov*.jar, bouncycastle*.jar, ec2*.jar, aso… I checked all these jar files on our WebLogic Server libraries as well as AEM ones and found what I believe was the issue: different versions of these jars being loaded. To confirm and before changing anything, I deployed the WebLogic CAT and found:

  • 0 conflicts in adobe-livecycle-cq-author.ear
  • 0 conflicts in adobe-livecycle-native-weblogic-x86_linux.ear
  • 5339 conflicts in adobe-livecycle-weblogic.ear

 
These numbers pretty much confirmed what I thought already. Going further, I found a few hundred conflicts related to the “org.bouncycastle.*” classes only. One of these being for the class “org.bouncycastle.asn1.x509.AlgorithmIdentifier” and it was conflicting between the following files:

  • WebLogic: $MW_HOME/oracle_common/modules/org.bouncycastle.bcprov-jdk15on.jar (1st loaded)
  • WebLogic: $MW_HOME/oracle_common/modules/org.bouncycastle.bcprov-ext-jdk15on.jar
  • AEM: $APPLICATIONS/adobe-livecycle-weblogic.ear/bcprov-151.jar

 
So what should be done to fix this? Well, a simple solution is just to force WebLogic to use the AEM provided files by default by updating the load preferences:

[weblogic@aem-node-1 ~]$ cd $APPLICATIONS
[weblogic@aem-node-1 AEM]$ 
[weblogic@aem-node-1 AEM]$ jar -xvf adobe-livecycle-weblogic.ear META-INF/weblogic-application.xml
[weblogic@aem-node-1 AEM]$ 
[weblogic@aem-node-1 AEM]$ grep -B1 "</prefer-application-packages>" META-INF/weblogic-application.xml
<package-name>org.mozilla.javascript.xmlimpl.*</package-name>
</prefer-application-packages>
[weblogic@aem-node-1 AEM]$ 
[weblogic@aem-node-1 AEM]$ sed -i 's,</prefer-application-packages>,<package-name>org.bouncycastle.*</package-name>\n&,' META-INF/weblogic-application.xml
[weblogic@aem-node-1 AEM]$ 
[weblogic@aem-node-1 AEM]$ grep -B2 "</prefer-application-packages>" META-INF/weblogic-application.xml
<package-name>org.mozilla.javascript.xmlimpl.*</package-name>
<package-name>org.bouncycastle.*</package-name>
</prefer-application-packages>
[weblogic@aem-node-1 AEM]$ 
[weblogic@aem-node-1 AEM]$ jar -uvf adobe-livecycle-weblogic.ear META-INF/weblogic-application.xml
[weblogic@aem-node-1 AEM]$ rm -rf META-INF
[weblogic@aem-node-1 AEM]$

 

What the above commands are doing is simply to add “<package-name>org.bouncycastle.*</package-name>” just before the end of the “<prefer-application-packages>” section so that WebLogic will know that it needs to use the AEM provided classes for this package and it shouldn’t use its own files. Once that is done, simply redeploy the EAR file. In my case, I was left with “only” 2442 conflicts, none regarding the bouncycastle (obviously).

After that, executing the same Certify action with the new classloader preferences resulted in no more errors:

####<Aug 28, 2019 1:12:22,359 PM UTC> <Info> <com.adobe.livecycle.usermanager.sslauthprovider.SSLMutualAuthProvider> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '109' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-1475E729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000071> <1566997942359> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Got Subject DN as CN=aem-dev,OU=IT,O=dbi services,L=Delemont,ST=Jura,C=CH>
####<Aug 28, 2019 1:12:23,702 PM UTC> <Info> <com.adobe.livecycle.usermanager.sslauthprovider.SSLMutualAuthProvider> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '116' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-147BE729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000072> <1566997943702> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Got Subject DN as CN=aem-dev,OU=IT,O=dbi services,L=Delemont,ST=Jura,C=CH>
####<Aug 28, 2019 1:12:24,199 PM UTC> <Info> <com.adobe.formServer.config.FormServerConfigImpl> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '116' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-14A8E729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000072> <1566997944199> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <FSC008: Using the database to access and persist configuration properties.>
####<Aug 28, 2019 1:12:24,199 PM UTC> <Info> <com.adobe.formServer.config.FormServerConfigImpl> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '116' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-14A8E729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000072> <1566997944199> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <FSC001: The property LastCacheResetTime has been changed from  to 1555070921173>
####<Aug 28, 2019 1:12:24,200 PM UTC> <Info> <com.adobe.formServer.config.FormServerConfigImpl> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '116' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-14A8E729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000072> <1566997944200> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <FSC001: The property CacheValidationTime has been changed from 0 to 1555070921058>
####<Aug 28, 2019 1:12:24,202 PM UTC> <Info> <com.adobe.formServer.common.cachemanager.CacheConfig> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '116' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-14A8E729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000072> <1566997944202> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Initializing cache from default values >
####<Aug 28, 2019 1:12:24,691 PM UTC> <Info> <Common> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '116' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-14F2E729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000072> <1566997944691> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000628> <Created "1" resources for pool "IDP_DS", out of which "1" are available and "0" are unavailable.>
####<Aug 28, 2019 1:12:24,704 PM UTC> <Info> <com.adobe.formServer.common.cachemanager.CacheConfig> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '116' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-14A8E729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000072> <1566997944704> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Initializing cache from default values >
####<Aug 28, 2019 1:12:24,710 PM UTC> <Info> <com.adobe.formServer.common.cachemanager.CacheConfig> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '116' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-14A8E729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000072> <1566997944710> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Initializing cache from default values >
####<Aug 28, 2019 1:12:24,717 PM UTC> <Info> <com.adobe.formServer.common.cachemanager.CacheConfig> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '116' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-14A8E729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000072> <1566997944717> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Initializing cache from default values >
####<Aug 28, 2019 1:12:24,724 PM UTC> <Info> <com.adobe.formServer.common.cachemanager.CacheConfig> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '116' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-14A8E729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000072> <1566997944724> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Initializing cache from default values >
####<Aug 28, 2019 1:12:24,928 PM UTC> <Info> <com.adobe.formServer.config.FormServerConfigImpl> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '116' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-14A8E729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000072> <1566997944928> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <FSC008: Using the database to access and persist configuration properties.>

 

The generated PDF now contained a correct Time-Stamp information but still not LTV information:

Certify PDF - TSP working & LTV failed

Finally, adding a Validation step after the Certify step in the process (in the AEM Application (LCA)) allowed both TSP and LTV information to be shown properly:

Certify PDF - TSP working & LTV working

Cet article AEM Forms – Certify PDF end-up with NoSuchMethodError on bouncycastle est apparu en premier sur Blog dbi services.

AEM Forms – WebLogic Clustering synch issue for Workbench 2-way-SSL

Thu, 2019-10-31 13:00

In a previous blog, I described the process to setup the AEM Forms to allow the AEM Workbench to connect to AEM using “2-way-SSL”. This setup is normally independent of the Application Server that you are using to host AEM. However, I already faced an issue (other than this one) which was caused by the 2-way-SSL setup for the Workbench in case of a WebLogic Cluster has been used to host AEM.

As mentioned in previous blog, I’m not an AEM expert but I know a few things about WebLogic so the idea here was to setup a fully functional WebLogic Cluster composed of two Managed Servers on two hosts/machines, test it properly and then install the AEM Forms application on top of it. Obviously, AEM Forms has been configured behind a Load Balancer for this purpose. At this point, AEM Forms was working perfectly in HA and stopping one of the nodes wasn’t a problem.

Then I configured the Workbench for 2-way-SSL and I did so while being connected to the AEM Node1 in Workbench, creating the Hybrid Domain in the AEM AdminUI Node1, aso… At the end of the setup, the AEM Workbench was working properly with the 2-way-SSL setup as well so it looked like the setup was completed. Just to be sure, I stopped the AEM Node1 and try to login to the AEM Workbench with the exact same parameters (same keystore, same truststore, same passwords) except for the target Server which I switched to the AEM Node2. Doing so, the login failed and I could see in the AEM Node2 Managed Server logs the following message:

####<Feb 12, 2019 2:14:46,277 PM UTC> <Info> <EJB> <aem-node-2> <msAEM-02> <[ACTIVE] ExecuteThread: '76' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <> <81fe4dac-31f0-4c25-bf37-17d5b327a901-0000005e> <1549980886277> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-010227> <EJB exception occurred during invocation from home or business: com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionBMTAdapter_fw85em_Intf generated exception: ALC-DSC-124-000: com.adobe.idp.dsc.DSCAuthorizationException: User does not have the Service Read Permission.>

 
Just looking at this message, it’s clear that the user account that is working properly for the AEM Node1 isn’t working for the AEM Node2. After some investigation, it looked like the Hybrid Domain wasn’t shown on the AEM AdminUI Node2, for some reason… Both nodes are using the same Oracle Database and the same GDS (Global Document Storage) path so I thought that the issue might be related to a cache somewhere in AEM. Therefore, I thought about re-creating the Hybrid Domain but I just cancelled this move right away because I assume it could have bring me more trouble than solution (I didn’t want to create 2 objects with the same name, avoid corruption or whatever…):

  • Open the AEM AdminUI Node2 (Node1 is still down) (https://<AEM_HOST_2>:<AEM_PORT>/adminui)
  • Login with an administative account (E.g.: administrator)
  • Navigate to: Settings > User Management > Domain Management
    • -> Only 1 domain is displayed, the default one: DefaultDom
  • Click on “New Hybrid Domain
    • Click on “Cancel”

 
After doing that, the Hybrid Domain (the one created in this blog, named “SSLMutualAuthProvider“) magically appeared so I assume that it forced a synchronization and an update of the cache on the AEM Node2. Trying again a login to the AEM Workbench without changing the parameters printed the following on the AEM Node2 Managed Server logs:

####<Feb 12, 2019 2:30:43,208 PM UTC> <Info> <com.adobe.livecycle.usermanager.sslauthprovider.SSLMutualAuthProvider> <aem-node-2> <msAEM-02> <[ACTIVE] ExecuteThread: '117' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-24A18C6CA9D79C032EFA> <81fe4dac-31f0-4c25-bf37-17d5b327a901-00000067> <1549981843208> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Got Subject DN as CN=aem-dev,OU=IT,O=dbi services,L=Delemont,ST=Jura,C=CH>

 
The above message means that the login is successful and Workbench is able to load the data from AEM properly. I guess that there are other ways to fix this issue. There is a “Sync Now” as well as a “Refresh” button on the Domain Management page of the AdminUI so maybe this would have done the same thing and forced a synchronization… I must admit that I first thought about re-creating the Hybrid Domain but cancelled that and since it solved my issue, I couldn’t test more, unfortunately. A restart of the AEM Node2 is also sufficient to force a refresh but this takes a few minutes and it requires a downtime so it’s not ideal.

Cet article AEM Forms – WebLogic Clustering synch issue for Workbench 2-way-SSL est apparu en premier sur Blog dbi services.

SLES15 SP1 – New features

Thu, 2019-10-31 05:20

With SLES15 SUSE introduced the Multimodal OS and the unified installer. Means, you only get what you really need. Your OS is flexible and you can easily add features you need and remove them as well. But this article shouldn’t be an explanation of the multimodal OS, it will show you some of the new features of SLES15 SP1.

SUSE supports the migration from SLES15 to SLES15 SP1 in online mode.
You can upgrade using two possibilities, YaST migration (GUI) and Zypper migration (command line).
Be sure that your system is registered at the SUSE Customer Center or has a local RMT server. Afterwards, just use “zypper migration”, type the number of the product you want to migrate and accept the terms of the license. That’s it.
The best way to check, if the installation was successful.

sles15:~ # cat /etc/os-release | grep PRETTY_NAME
PRETTY_NAME="SUSE Linux Enterprise Server 15 SP1"

So let’s have a look at the new features and improvements of SLES15 SP1 .

Unified Installer

SUSE Manager Server and Proxy are now available as base products. Both can be installed using the unified installer.
Point of Service and SLE Real Time are also included in the unified installer now.

Transactional Update

In OpenSUSE Leap and SUSE CaaS transactional update was already implemented, now it is also possible to run transactional updates with SLE. To install transactional update, the Transactional Server Module needs to get activated first (no additional key is needed). Afterwards the transactional-update package and its dependencies can be installed.

sle15:~ #  SUSEConnect --product sle-module-transactional-server/15.1/x86_64
Registering system to SUSE Customer Center

Updating system details on https://scc.suse.com ...

Activating sle-module-transactional-server 15.1 x86_64 ...
-> Adding service to system ...
-> Installing release package ...

Successfully registered system
sle15:~ # zypper install transactional-update
Refreshing service 'Basesystem_Module_15_SP1_x86_64'.
Refreshing service 'SUSE_Linux_Enterprise_Server_15_SP1_x86_64'.
Refreshing service 'Server_Applications_Module_15_SP1_x86_64'.
Refreshing service 'Transactional_Server_Module_15_SP1_x86_64'.
Loading repository data...
Reading installed packages...
Resolving package dependencies...

The following 6 NEW packages are going to be installed:
  attr bc openslp psmisc rsync transactional-update

6 new packages to install.
Overall download size: 686.6 KiB. Already cached: 0 B. After the operation, additional 1.3 MiB will be used.
Continue? [y/n/v/...? shows all options] (y): y

As you maybe know, SUSE uses btrfs with snapper as default for the file systems. This builds the basis for the transactional updates. Transactional updates are applied into a new snapshot, so the running system is not touched. Using this technology, the updated snapshot will be activated after the next reboot. So this is an update, that is
– Atomic: either fully applied or not.
– Easily revertabled: after a failed update the return to the previous (running) system is easy.

Simplified btrfs layout

There is only one single subvolume under /var not 10 for simplified and consistens snapshots. This takes only effect for fresh installations. Upgraded systems still use the old layout.
Startings with SLES15 SP1 there is also the possibility to have each home-directory as single subvolume. But this is not the default.

Secure encrypted virtualization (SEV)

Data encryption is a important topic in todays IT environments. Data stored on disk is widley encrypted, but how about the data in RAM? AMD’s SEV gives the opportunity to protect Linux KVM virtual machines by encrypting the memory of each VM with a unique key. It can also generate a signature, that attests the correct encryption.
This increases system security a lot and protects VM for memory scrape attachs from hypervisor.
With SLES15 SP1, Suse provides full support for this technology. For further information about SEV, click here .

Quarterly Updates

Starting with 15 SP1 SUSE offers quarterly updates of the installation and package media. They will be refreshed every quarter with all maintenance and security updates. SO for the setup of new systems there is always a recent and up-to-date state.

Conclusion

This is not the full list of new features, only an abstract. But nevertheless, especially the transactional update makes it effortable to upgrade to SLES15 SP1. And always think about the security improvements which come with every new release.

Cet article SLES15 SP1 – New features est apparu en premier sur Blog dbi services.

Patroni Operations – Changing Parameters

Wed, 2019-10-30 10:17

Sooner or later all of us have to change a parameter on the database. But how is this put into execution when using a Patroni cluster? Of course there are some specifics you have to consider.
This post will give you a short introduction into this topic.

When you want to change a parameter on a Patroni cluster you have several possibilities:
– Dynamic configuration in DCS. These changes are applied asynchronously to every node.
– Local configuration in patroni.yml. This will take precedence over the dynamic configuration.
– Cluster configuration using “alter system”.
– Environment configuration using local environment variables.

Change PostgreSQL parameters using patronictl 1. Change parameters, that do not need a restart

If you want to change a parameter (or more) for the whole cluster, you should use patronictl. If you want to change the initial configuration as well, you should also adjust patroni.yml.

postgres@patroni1:/u01/app/postgres/local/dmk/etc/ [PG1] patronictl edit-config PG1

All parameters already set are shown and can be changed like in any other file using the vi commands:

postgres@patroni1:/u01/app/postgres/local/dmk/etc/ [PG1] patronictl edit-config PG1

loop_wait: 10
maximum_lag_on_failover: 1048576
postgresql:
  parameters:
    archive_command: /bin/true
    archive_mode: 'on'
    autovacuum_max_workers: '6'
    autovacuum_vacuum_scale_factor: '0.1'
    autovacuum_vacuum_threshold: '50'
    client_min_messages: WARNING
    effective_cache_size: 512MB
    hot_standby: 'on'
    hot_standby_feedback: 'on'
    listen_addresses: '*'
    log_autovacuum_min_duration: 60s
    log_checkpoints: 'on'
    log_connections: 'on'
    log_directory: pg_log
    log_disconnections: 'on'
    log_duration: 'on'
    log_filename: postgresql-%a.log
    log_line_prefix: '%m - %l - %p - %h - %u@%d - %x'
    log_lock_waits: 'on'
    log_min_duration_statement: 30s
    log_min_error_statement: NOTICE
    log_min_messages: WARNING
    log_rotation_age: '1440'
    log_statement: ddl
    log_temp_files: '0'
    log_timezone: Europe/Zurich
    log_truncate_on_rotation: 'on'
    logging_collector: 'on'
    maintenance_work_mem: 64MB
    max_replication_slots: 10
    max_wal_senders: '20'
    port: 5432
    shared_buffers: 128MB
    shared_preload_libraries: pg_stat_statements
    wal_compression: 'off'
    wal_keep_segments: 8
    wal_level: replica
    wal_log_hints: 'on'
    work_mem: 8MB
  use_pg_rewind: true
  use_slots: true
retry_timeout: 10
ttl: 30

Once saved, you get the following:

---
+++
@@ -2,7 +2,8 @@
 maximum_lag_on_failover: 1048576
 postgresql:
   parameters:
-    archive_command: /bin/true
+    archive_command: 'test ! -f /u99/pgdata/PG1/archived_wal/%f && cp %p /u99/pgdata/PG1/archived_wal/%f'
     archive_mode: 'on'
     autovacuum_max_workers: '6'
     autovacuum_vacuum_scale_factor: '0.1'

Apply these changes? [y/N]: y
Configuration changed

When connecting to the database you will see, that the parameter is changed now. It is also changed on all the other nodes.

 postgres@patroni1:/u01/app/postgres/local/dmk/etc/ [PG1] sq
psql (11.5)
Type "help" for help.

postgres=# show archive_command;
                                  archive_command
------------------------------------------------------------------------------------
 test ! -f /u99/pgdata/PG1/archived_wal/%f && cp %p /u99/pgdata/PG1/archived_wal/%f
(1 row)
2. Change parameters, that need a restart

How can parameters be changed that need a restart? Especially as we want to have a minimal downtime of the cluster.
First of all the parameter can be changed the same way as the parameters that do not need a restart using patronictl edit-config. Once the parameter is changed the status overview of the cluster gets a new column showing which node needs a restart.

postgres@patroni1:/u01/app/postgres/local/dmk/etc/ [PG1] patronictl list
+---------+----------+----------------+--------+---------+----+-----------+-----------------+
| Cluster |  Member  |      Host      |  Role  |  State  | TL | Lag in MB | Pending restart |
+---------+----------+----------------+--------+---------+----+-----------+-----------------+
|   PG1   | patroni1 | 192.168.22.111 | Leader | running |  4 |       0.0 |        *        |
|   PG1   | patroni2 | 192.168.22.112 |        | running |  4 |       0.0 |        *        |
|   PG1   | patroni3 | 192.168.22.113 |        | running |  4 |       0.0 |        *        |
+---------+----------+----------------+--------+---------+----+-----------+-----------------+

Afterwards there are two possibilites.

2.1 Restart node by node

If you do not want to restart the whole cluster, you have the possibility to restart each node separatly. Keep in mind, that you have to restart the Leader Node first, otherwise the change does not take effect. It is also possible to schedule the restart of a node.

postgres@patroni1:/u01/app/postgres/local/dmk/etc/ [PG1] patronictl restart PG1 patroni1
When should the restart take place (e.g. 2019-10-08T15:33)  [now]:
+---------+----------+----------------+--------+---------+----+-----------+-----------------+
| Cluster |  Member  |      Host      |  Role  |  State  | TL | Lag in MB | Pending restart |
+---------+----------+----------------+--------+---------+----+-----------+-----------------+
|   PG1   | patroni1 | 192.168.22.111 | Leader | running |  4 |       0.0 |        *        |
|   PG1   | patroni2 | 192.168.22.112 |        | running |  4 |       0.0 |        *        |
|   PG1   | patroni3 | 192.168.22.113 |        | running |  4 |       0.0 |        *        |
+---------+----------+----------------+--------+---------+----+-----------+-----------------+
Are you sure you want to restart members patroni1? [y/N]: y
Restart if the PostgreSQL version is less than provided (e.g. 9.5.2)  []:
Success: restart on member patroni1
postgres@patroni1:/u01/app/postgres/local/dmk/etc/ [PG1] patronictl restart PG1 patroni2
When should the restart take place (e.g. 2019-10-08T15:34)  [now]:
+---------+----------+----------------+--------+---------+----+-----------+-----------------+
| Cluster |  Member  |      Host      |  Role  |  State  | TL | Lag in MB | Pending restart |
+---------+----------+----------------+--------+---------+----+-----------+-----------------+
|   PG1   | patroni1 | 192.168.22.111 | Leader | running |  4 |       0.0 |                 |
|   PG1   | patroni2 | 192.168.22.112 |        | running |  4 |       0.0 |        *        |
|   PG1   | patroni3 | 192.168.22.113 |        | running |  4 |       0.0 |        *        |
+---------+----------+----------------+--------+---------+----+-----------+-----------------+
Are you sure you want to restart members patroni2? [y/N]: y
Restart if the PostgreSQL version is less than provided (e.g. 9.5.2)  []:
Success: restart on member patroni2
postgres@patroni1:/u01/app/postgres/local/dmk/etc/ [PG1] patronictl restart PG1 patroni3
When should the restart take place (e.g. 2019-10-08T15:34)  [now]:
+---------+----------+----------------+--------+---------+----+-----------+-----------------+
| Cluster |  Member  |      Host      |  Role  |  State  | TL | Lag in MB | Pending restart |
+---------+----------+----------------+--------+---------+----+-----------+-----------------+
|   PG1   | patroni1 | 192.168.22.111 | Leader | running |  4 |       0.0 |                 |
|   PG1   | patroni2 | 192.168.22.112 |        | running |  4 |       0.0 |                 |
|   PG1   | patroni3 | 192.168.22.113 |        | running |  4 |       0.0 |        *        |
+---------+----------+----------------+--------+---------+----+-----------+-----------------+
Are you sure you want to restart members patroni3? [y/N]: y
Restart if the PostgreSQL version is less than provided (e.g. 9.5.2)  []:
Success: restart on member patroni3
postgres@patroni1:/u01/app/postgres/local/dmk/etc/ [PG1] patronictl list
+---------+----------+----------------+--------+---------+----+-----------+
| Cluster |  Member  |      Host      |  Role  |  State  | TL | Lag in MB |
+---------+----------+----------------+--------+---------+----+-----------+
|   PG1   | patroni1 | 192.168.22.111 | Leader | running |  4 |       0.0 |
|   PG1   | patroni2 | 192.168.22.112 |        | running |  4 |       0.0 |
|   PG1   | patroni3 | 192.168.22.113 |        | running |  4 |       0.0 |
+---------+----------+----------------+--------+---------+----+-----------+
2.2 Restart the whole cluster

In case you don’t want to restart node by node and you have the possibility of a downtime, it is also possible to restart the whole cluster (scheduled or immediately)

postgres@patroni1:/u01/app/postgres/local/dmk/etc/ [PG1] patronictl restart PG1
When should the restart take place (e.g. 2019-10-08T15:37)  [now]:
+---------+----------+----------------+--------+---------+----+-----------+-----------------+
| Cluster |  Member  |      Host      |  Role  |  State  | TL | Lag in MB | Pending restart |
+---------+----------+----------------+--------+---------+----+-----------+-----------------+
|   PG1   | patroni1 | 192.168.22.111 | Leader | running |  4 |       0.0 |        *        |
|   PG1   | patroni2 | 192.168.22.112 |        | running |  4 |       0.0 |        *        |
|   PG1   | patroni3 | 192.168.22.113 |        | running |  4 |       0.0 |        *        |
+---------+----------+----------------+--------+---------+----+-----------+-----------------+
Are you sure you want to restart members patroni1, patroni2, patroni3? [y/N]: y
Restart if the PostgreSQL version is less than provided (e.g. 9.5.2)  []:
Success: restart on member patroni1
Success: restart on member patroni2
Success: restart on member patroni3
postgres@patroni1:/u01/app/postgres/local/dmk/etc/ [PG1] patronictl list
+---------+----------+----------------+--------+---------+----+-----------+
| Cluster |  Member  |      Host      |  Role  |  State  | TL | Lag in MB |
+---------+----------+----------------+--------+---------+----+-----------+
|   PG1   | patroni1 | 192.168.22.111 | Leader | running |  4 |       0.0 |
|   PG1   | patroni2 | 192.168.22.112 |        | running |  4 |       0.0 |
|   PG1   | patroni3 | 192.168.22.113 |        | running |  4 |       0.0 |
+---------+----------+----------------+--------+---------+----+-----------+
Change PostgreSQL parameters using “alter system”

Of course you can change a parameter only on one node using “alter system”, too.

 postgres@patroni1:/home/postgres/ [PG1] sq
psql (11.5)
Type "help" for help.

postgres=# show archive_Command;
 archive_command
-----------------
 /bin/false
(1 row)

postgres=# alter system set archive_command='/bin/true';
ALTER SYSTEM

postgres=# select pg_reload_conf();
 pg_reload_conf
----------------
 t
(1 row)

postgres=# show archive_command;
 archive_command
-----------------
 /bin/true
(1 row)

For sure the parameter change is not automatically applied to the replicas. The parameter is only changed on that node. All the other nodes will keep the value from the DCS. So you can change the parameter using “patronictl edit-config” or with an “alter system” command on each node. But: you also have to keep in mind the order in which the parameters are applied. The “alter system” change will persist the “patronictl edit-config” command.

Conclusion

So if you consider that there are some specialities when changing parameters in a Patroni cluster, it is quite easy to change a parameter. There are some parameters that need the same value on all nodes, e.g. max_connections, max_worker_processes, wal_level. And there are as well some parameters controlled by patroni, e.g listen_addresses and port. For a more details check the Patroni documentation . And last but not least: If you change the configuration with patronictl and one node still has another configuration. Look for a postgresql.auto.conf in the PGDATA directory. Maybe there you can find the reason for different parameters on your nodes.
If you are interested in more “Patroni Operations” blogs, check also this one Patroni operations: Switchover and Failover.

Cet article Patroni Operations – Changing Parameters est apparu en premier sur Blog dbi services.

PostgreSQL 13 will come with partitioning support for pgbench

Wed, 2019-10-30 08:32

A lot of people use pgbench to benchmark a PostgreSQL instance and pgbench is also heavily used by the PostgreSQL developers. While declarative partitioning was introduced in PostgreSQL 10 there was no support for that in pgbench, even in the current version, which is PostgreSQL 12. With PostgreSQL 13, which is currently in development, this will change and pgbench will be able to create a partitioned pgbench_accounts tables you then can run your benchmark against.

Having a look at the parameters of pgbench in PostgreSQL 13, two new ones pop up:

postgres@centos8pg:/home/postgres/ [pgdev] pgbench --help
pgbench is a benchmarking tool for PostgreSQL.

Usage:
pgbench [OPTION]... [DBNAME]

Initialization options:
-i, --initialize         invokes initialization mode
-I, --init-steps=[dtgvpf]+ (default "dtgvp")
run selected initialization steps
-F, --fillfactor=NUM     set fill factor
-n, --no-vacuum          do not run VACUUM during initialization
-q, --quiet              quiet logging (one message each 5 seconds)
-s, --scale=NUM          scaling factor
--foreign-keys           create foreign key constraints between tables
--index-tablespace=TABLESPACE
create indexes in the specified tablespace
--partitions=NUM         partition pgbench_accounts in NUM parts (default: 0)
--partition-method=(range|hash)
partition pgbench_accounts with this method (default: range)
--tablespace=TABLESPACE  create tables in the specified tablespace
--unlogged-tables        create tables as unlogged tables

Options to select what to run:
-b, --builtin=NAME[@W]   add builtin script NAME weighted at W (default: 1)
(use "-b list" to list available scripts)
-f, --file=FILENAME[@W]  add script FILENAME weighted at W (default: 1)
-N, --skip-some-updates  skip updates of pgbench_tellers and pgbench_branches
(same as "-b simple-update")
-S, --select-only        perform SELECT-only transactions
(same as "-b select-only")

Benchmarking options:
-c, --client=NUM         number of concurrent database clients (default: 1)
-C, --connect            establish new connection for each transaction
-D, --define=VARNAME=VALUE
define variable for use by custom script
-j, --jobs=NUM           number of threads (default: 1)
-l, --log                write transaction times to log file
-L, --latency-limit=NUM  count transactions lasting more than NUM ms as late
-M, --protocol=simple|extended|prepared
protocol for submitting queries (default: simple)
-n, --no-vacuum          do not run VACUUM before tests
-P, --progress=NUM       show thread progress report every NUM seconds
-r, --report-latencies   report average latency per command
-R, --rate=NUM           target rate in transactions per second
-s, --scale=NUM          report this scale factor in output
-t, --transactions=NUM   number of transactions each client runs (default: 10)
-T, --time=NUM           duration of benchmark test in seconds
-v, --vacuum-all         vacuum all four standard tables before tests
--aggregate-interval=NUM aggregate data over NUM seconds
--log-prefix=PREFIX      prefix for transaction time log file
(default: "pgbench_log")
--progress-timestamp     use Unix epoch timestamps for progress
--random-seed=SEED       set random seed ("time", "rand", integer)
--sampling-rate=NUM      fraction of transactions to log (e.g., 0.01 for 1%)
--show-script=NAME       show builtin script code, then exit

Common options:
-d, --debug              print debugging output
-h, --host=HOSTNAME      database server host or socket directory
-p, --port=PORT          database server port number
-U, --username=USERNAME  connect as specified database user
-V, --version            output version information, then exit
-?, --help               show this help, then exit

Report bugs to .

That should give us partitions according to the amount of partitions and partitioning method we chose, so let’s populate a new database:

postgres@centos8pg:/home/postgres/ [pgdev] psql -c "create database pgbench" postgres
CREATE DATABASE
Time: 326.715 ms
postgres@centos8pg:/home/postgres/ [pgdev] pgbench -i -s 10 --partitions=10 --partition-method=range --foreign-keys pgbench
dropping old tables...
NOTICE:  table "pgbench_accounts" does not exist, skipping
NOTICE:  table "pgbench_branches" does not exist, skipping
NOTICE:  table "pgbench_history" does not exist, skipping
NOTICE:  table "pgbench_tellers" does not exist, skipping
creating tables...
creating 10 partitions...
generating data...
100000 of 1000000 tuples (10%) done (elapsed 0.20 s, remaining 1.78 s)
200000 of 1000000 tuples (20%) done (elapsed 0.40 s, remaining 1.62 s)
300000 of 1000000 tuples (30%) done (elapsed 0.74 s, remaining 1.73 s)
400000 of 1000000 tuples (40%) done (elapsed 1.23 s, remaining 1.85 s)
500000 of 1000000 tuples (50%) done (elapsed 1.47 s, remaining 1.47 s)
600000 of 1000000 tuples (60%) done (elapsed 1.81 s, remaining 1.21 s)
700000 of 1000000 tuples (70%) done (elapsed 2.25 s, remaining 0.97 s)
800000 of 1000000 tuples (80%) done (elapsed 2.46 s, remaining 0.62 s)
900000 of 1000000 tuples (90%) done (elapsed 2.81 s, remaining 0.31 s)
1000000 of 1000000 tuples (100%) done (elapsed 3.16 s, remaining 0.00 s)
vacuuming...
creating primary keys...
creating foreign keys...
done in 5.78 s (drop tables 0.00 s, create tables 0.07 s, generate 3.29 s, vacuum 0.84 s, primary keys 0.94 s, foreign keys 0.65 s).

The pgbench_accounts table should now be partitioned by range:

postgres@centos8pg:/home/postgres/ [pgdev] psql -c "\d+ pgbench_accounts" pgbench
Partitioned table "public.pgbench_accounts"
Column  |     Type      | Collation | Nullable | Default | Storage  | Stats target | Description
----------+---------------+-----------+----------+---------+----------+--------------+-------------
aid      | integer       |           | not null |         | plain    |              |
bid      | integer       |           |          |         | plain    |              |
abalance | integer       |           |          |         | plain    |              |
filler   | character(84) |           |          |         | extended |              |
Partition key: RANGE (aid)
Indexes:
"pgbench_accounts_pkey" PRIMARY KEY, btree (aid)
Foreign-key constraints:
"pgbench_accounts_bid_fkey" FOREIGN KEY (bid) REFERENCES pgbench_branches(bid)
Referenced by:
TABLE "pgbench_history" CONSTRAINT "pgbench_history_aid_fkey" FOREIGN KEY (aid) REFERENCES pgbench_accounts(aid)
Partitions: pgbench_accounts_1 FOR VALUES FROM (MINVALUE) TO (100001),
pgbench_accounts_10 FOR VALUES FROM (900001) TO (MAXVALUE),
pgbench_accounts_2 FOR VALUES FROM (100001) TO (200001),
pgbench_accounts_3 FOR VALUES FROM (200001) TO (300001),
pgbench_accounts_4 FOR VALUES FROM (300001) TO (400001),
pgbench_accounts_5 FOR VALUES FROM (400001) TO (500001),
pgbench_accounts_6 FOR VALUES FROM (500001) TO (600001),
pgbench_accounts_7 FOR VALUES FROM (600001) TO (700001),
pgbench_accounts_8 FOR VALUES FROM (700001) TO (800001),
pgbench_accounts_9 FOR VALUES FROM (800001) TO (900001)

The same should work for hash partitioning:

postgres@centos8pg:/home/postgres/ [pgdev] pgbench -i -s 10 --partitions=10 --partition-method=hash --foreign-keys pgbench
dropping old tables...
creating tables...
creating 10 partitions...
generating data...
100000 of 1000000 tuples (10%) done (elapsed 0.19 s, remaining 1.69 s)
200000 of 1000000 tuples (20%) done (elapsed 0.43 s, remaining 1.71 s)
300000 of 1000000 tuples (30%) done (elapsed 0.67 s, remaining 1.55 s)
400000 of 1000000 tuples (40%) done (elapsed 1.03 s, remaining 1.54 s)
500000 of 1000000 tuples (50%) done (elapsed 1.22 s, remaining 1.22 s)
600000 of 1000000 tuples (60%) done (elapsed 1.59 s, remaining 1.06 s)
700000 of 1000000 tuples (70%) done (elapsed 1.80 s, remaining 0.77 s)
800000 of 1000000 tuples (80%) done (elapsed 2.16 s, remaining 0.54 s)
900000 of 1000000 tuples (90%) done (elapsed 2.36 s, remaining 0.26 s)
1000000 of 1000000 tuples (100%) done (elapsed 2.69 s, remaining 0.00 s)
vacuuming...
creating primary keys...
creating foreign keys...
done in 4.99 s (drop tables 0.10 s, create tables 0.08 s, generate 2.74 s, vacuum 0.84 s, primary keys 0.94 s, foreign keys 0.30 s).
postgres@centos8pg:/home/postgres/ [pgdev] psql -c "\d+ pgbench_accounts" pgbench
Partitioned table "public.pgbench_accounts"
Column  |     Type      | Collation | Nullable | Default | Storage  | Stats target | Description
----------+---------------+-----------+----------+---------+----------+--------------+-------------
aid      | integer       |           | not null |         | plain    |              |
bid      | integer       |           |          |         | plain    |              |
abalance | integer       |           |          |         | plain    |              |
filler   | character(84) |           |          |         | extended |              |
Partition key: HASH (aid)
Indexes:
"pgbench_accounts_pkey" PRIMARY KEY, btree (aid)
Foreign-key constraints:
"pgbench_accounts_bid_fkey" FOREIGN KEY (bid) REFERENCES pgbench_branches(bid)
Referenced by:
TABLE "pgbench_history" CONSTRAINT "pgbench_history_aid_fkey" FOREIGN KEY (aid) REFERENCES pgbench_accounts(aid)
Partitions: pgbench_accounts_1 FOR VALUES WITH (modulus 10, remainder 0),
pgbench_accounts_10 FOR VALUES WITH (modulus 10, remainder 9),
pgbench_accounts_2 FOR VALUES WITH (modulus 10, remainder 1),
pgbench_accounts_3 FOR VALUES WITH (modulus 10, remainder 2),
pgbench_accounts_4 FOR VALUES WITH (modulus 10, remainder 3),
pgbench_accounts_5 FOR VALUES WITH (modulus 10, remainder 4),
pgbench_accounts_6 FOR VALUES WITH (modulus 10, remainder 5),
pgbench_accounts_7 FOR VALUES WITH (modulus 10, remainder 6),
pgbench_accounts_8 FOR VALUES WITH (modulus 10, remainder 7),
pgbench_accounts_9 FOR VALUES WITH (modulus 10, remainder 8).

Looks fine. Now you can easily benchmark against a partitioned pgbench_accounts table.

Cet article PostgreSQL 13 will come with partitioning support for pgbench est apparu en premier sur Blog dbi services.

PostgreSQL check_function_bodies, what is it good for?

Sun, 2019-10-27 03:35

One of the probably lesser known PostgreSQL parameters is check_function_bodies. If you know Oracle, then you for sure faced “invalid objects” a lot. In PostgreSQL, by default, there is nothing like an invalid object. That implies that you can not create a function or procedure which references an object that does not yet exist.

Lets assume you want to create a function like this, but the table “t1” does not exist:

postgres=# create or replace function f1 () returns setof t1 as
$$
select * from t1;
$$ language 'sql';
ERROR:  type "t1" does not exist

PostgreSQL will not create the function as a dependent objects does not exist. Once the table is there the function will be created:

postgres=# create table t1 ( a int );
CREATE TABLE
postgres=# insert into t1 values(1);
INSERT 0 1
postgres=# create or replace function f1 () returns setof t1 as
$$
select * from t1;
$$ language 'sql';
CREATE FUNCTION
postgres=# select * from f1();
a
---
1

The issue with that is, that you need to follow the order in which functions gets created. Especially when you need to load functions for other users that can easily become tricky and time consuming. This is where check_function_bodies helps:

postgres=# set check_function_bodies = false;
SET
postgres=# create or replace function f2 () returns setof t1 as
$$
select * from t2;
$$ language 'sql';
CREATE FUNCTION

The function was created although t2 did not exist. Executing the function right now of course will generate an error:

postgres=# select * from f2();
ERROR:  relation "t2" does not exist
LINE 2: select * from t2;
^
QUERY:
select * from t2;

CONTEXT:  SQL function "f2" during startup

Once the table is there all is fine:

postgres=# create table t2 ( a int );
CREATE TABLE
postgres=# insert into t2 values (2);
INSERT 0 1
postgres=# select * from f2();
a
---
2

This is very helpful when loading objects provided by an external vendor. pg_dump is doing that by default.

Cet article PostgreSQL check_function_bodies, what is it good for? est apparu en premier sur Blog dbi services.

AEM Forms – No SSLMutualAuthProvider available

Sat, 2019-10-26 15:00

In the process of setting up the AEM Workbench to use 2-way-SSL, you will need at some point to use a Hybrid Domain and a specific Authentication Provider. Depending on the version of the AEM that you are using, this Authentication Provider might not be present and therefore you will never be able to set that up properly. In this blog, I will describe what was done in our case to solve this problem.

The first time we tried to set that up (WebLogic Server 12.2, AEM 6.4.0), it just wasn’t working. Therefore, we opened a case with the Adobe Support and after quite some time, we found out that the documentation was not complete (#CQDOC-13273) and that there were actually missing steps and missing configuration inside the AEM to allow the 2-way-SSL to work. So basically everything said that the 2-way-SSL was possible but there were just missing pieces inside AEM to have it really working. Therefore after discussion & investigation with the Adobe Support Engineers (#NPR-26490), they provided us the missing piece: adobe-usermanager-ssl-dsc.jar.

When you install AEM Forms, it will automatically deploy a bunch of DSC (jar file) to provide all features of the AEM Forms. These are a few examples:

  • adobe-pdfservices-dsc.jar
  • adobe-usermanager-dsc.jar
  • adobe-jobmanager-dsc.jar
  • adobe-scheduler-weblogic-dsc.jar

Therefore, our AEM Forms version at that time (mid-2018, AEM 6.4.0) was missing one of these DSC and it was the root cause of our issue. So what can you do fix that? Well you just have deploy it and since we are anyway in the middle of working with the AEM Workbench to set it up with 2-way-SSL, that’s perfect. While the Workbench is still able to use 1-way-SSL (don’t set your Application Server in 2-way-SSL or revert it to 1-way-SSL):

  • Download or request the file “adobe-usermanager-ssl-dsc.jar” for your AEM version to the Adobe Support
  • Open the AEM Workbench (run the workbench.exe file)
  • Click on “File > Login
  • Set the Log on to to: <AEM_HOST> – SimpleAuth (or whatever the name of your SimpleAuth is)
  • Set the Username to: administrator (or whatever other account you have)
  • Set the Password for this account
  • Click on “Login
  • Click on “Window > Show View > Components
  • The Components window should be opened (if not already done before) somewhere on the screen (most probably on the left side)
  • Inside the Components window, right click on the “Components” folder and select “Install Component …
  • Find the file “adobe-usermanager-ssl-dsc.jar” that has been downloaded earlier, select it and click on “Open
  • Right click on the “Components” folder and select “Refresh
  • Expand the “Components” folder (if not already done), and look for the component named “SSLAuthProvider
  • If this component isn’t started yet (there is a red square on the package), then start it using the following steps:
    • Right click on “SSLAuthProvider
    • Select “Start Component

Note: If the “SSLAuthProvider” component already exists, then you will see an error. This is fine, it just needs to be there and to be started/running. If this is the case then it’s all good.

Workbench - Open components

Workbench - Refresh components

Workbench - Start component

Once the SSLAuthProvider DSC has been installed and is running, you should be able to see the SSLMutualAuthProvider in the list of custom providers while creating the Hybrid Domain on the AdminUI. Adobe was normally supposed to fix this in the following releases but I didn’t get the opportunity to test the installation of AEM 6.5 from scratch yet. If you have this information, don’t hesitate to share!

Cet article AEM Forms – No SSLMutualAuthProvider available est apparu en premier sur Blog dbi services.

AEM Forms – “2-way-SSL” Setup and Workbench configuration

Sat, 2019-10-26 14:45

In the past two years almost, I have been working with AEM (Adobe Experience Manager) Forms. The road taken by this project was full of problem because of security constraints that AEM has/had big trouble dealing with. In this blog, I will talk about one security aspect which brings some trouble: how to setup and use the “2-way-SSL” (I will describe below why I put that in quote) for the AEM Workbench.

I have been using AEM Forms 6.4.0 initially (20180228) with its associated Workbench version. I will consider that the AEM Forms has been installed already and is working properly. In this case, I used AEM Forms on a WebLogic Server (12.2) which I configured in HTTPS. So once you have that, what do you need to do to configure and use the AEM Workbench with “2-way-SSL”? Well first, let’s ensure that the AEM Workbench is working properly and then start with the setup.

Open the AEM Workbench and configure a new “Server”:

  • Open the AEM Workbench (run the workbench.exe file)
  • Click on “File > Login
  • Click on “Configure...”
  • Click on the “+” sign to add a new Server
    • Set the Server Title to: <AEM_HOST> – SimpleAuth
    • Set the Hostname to: <AEM_HOST>
    • Set the Protocol to: Simple Object Access Protocol (SOAP/HTTPs)
    • Set the Server Port Number to: <AEM_PORT>
    • Click on “OK
  • Click on “OK
  • Set the Log on to the newly created Server (“<AEM_HOST> – SimpleAuth“)
  • Set the Username to: administrator (or whatever other account you have)
  • Set the Password for this account
  • Click on “Login

Workbench login 1-way-SSL

If everything was done properly, the login should be working. The next step is to configure AEM for the “2-way-SSL” communications. As mentioned at the beginning of this blog, I put that in quote because it’s a 2-way-SSL but there is one security layer that is bypassed when doing that. With the AEM Workbench in 1-way-SSL, you need to enter a username and a credential. Adding a 2-way-SSL instead would normally just add another layer of security where the server and client will exchange their certificate and will trust each other but the user’s authentication is still needed!

In the case of the AEM Workbench, the “2-way-SSL” setup actually completely bypass the user’s authentication and therefore I do not really consider that as a real 2-way-SSL setup… It might even be considered as a security issue (it’s a shame for a feature that is supposed to increase security) because, as you will see below, as soon as you have the Client SSL Certificate (and its password obviously), then you will be able to access AEM Workbench. So protect this certificate with great care.

To configure the AEM, you will then need to create an Hybrid Domain:

  • Open the AEM AdminUI (https://<AEM_HOST>:<AEM_PORT>/adminui)
  • Login with the administrator account (or whatever other account you have)
  • Navigate to: Settings > User Management > Domain Management
  • Click on “New Hybrid Domain
    • Set the ID to: SSLMutualAuthProvider
    • Set the Name to: SSLMutualAuthProvider
    • Check the “Enable Account Locking” checkbox
    • Uncheck the “Enable Just In Time Provisioning” checkbox
    • Click on “Add Authentication
      • Set the “Authentication Provider” to: Custom
      • Check the “SSLMutualAuthProvider” checkbox
      • Click on “OK
    • Click on “OK

Note: If “SSLMutualAuthProvider” isn’t available on the Authentication page, then please check this blog.

Hybrid Domain 1

Hybrid Domain 2

Hybrid Domain 3

Then you will need to create a user. In this example, I will use a generic account but it is possible to have several accounts for each of your devs for example, in which case each user must have their own SSL Certificate. The user Canonical Name and ID must absolutely match the CN used to generate the SSL Certificate that the Client will use. So if you generated an SSL Certificate for the Client with “/C=CH/ST=Jura/L=Delemont/O=dbi services/OU=IT/CN=aem-dev“, then the Canonical Name and ID to be used for the user in AEM should be “aem-dev“:

  • Navigate to: Settings > User Management > Users and Groups
  • Click on “New User
  • On the New User (Step 1 of 3) screen:
    • Uncheck the “System Generated” checkbox
    • Set the Canonical Name to: <USER_CN>
    • Set the First Name to: 2-way-SSL
    • Set the Last Name to: User
    • Set the Domain to: SSLMutualAuthProvider
    • Set the User Id to: <USER_CN>
    • Click on “Next
  • On the New User: 2-way-SSL (Step 2 of 3) screen:
    • Click on “Next
  • On the New User: 2-way-SSL (Step 3 of 3) screen:
    • Click on “Find Roles
      • Check the checkbox for the Role Name: Application Administrator (or any other valid role that you want this user to be able to use)
      • Click on “OK
  • Click on “Finish

User 1

User 2

User 3

At this point, you can configure your Application Server to handle the 2-way-SSL communications. In WebLogic Server, this is done by setting the “Two Way Client Cert Behavior” to “Client Certs Requested and Enforced” in the SSL subtab of the Managed Server(s) hosting the AEM Forms applications.

Finally the last step is to get back to the AEM Workbench and try your 2-way-SSL communications. If you try again to use the SimpleAuth that we defined above, it should fail because the Application Server will require the Client SSL Certificate, which isn’t provided in this case. So let’s create a new “Server”:

  • Click on “File > Login
  • Click on “Configure...”
  • Click on the “+” sign to add a new Server
    • Set the Server Title to: <AEM_HOST> – MutualAuth
    • Set the Hostname to: <AEM_HOST>
    • Set the Protocol to: Simple Object Access Protocol (SOAP/HTTPs) Mutual Auth
    • Set the Server Port Number to: <AEM_PORT>
    • Click on “OK
  • Click on “OK
  • Set the Log on to the newly created Server (“<AEM_HOST> – MutualAuth“)
  • Set the Key Store to: file:C:\Users\Morgan\Documents\AEM_Workbench\aem-dev.jks (Adapt to wherever you put the keystore)
  • Set the Key Store Password to: <KEYSTORE_PWD>
  • Set the Trust Store to: file:C:\Users\Morgan\Documents\AEM_Workbench\trust.jks (Adapt to wherever you put the truststore)
  • Set the Trust Store Password to: <TRUSTSTORE_PWD>
  • Click on “Login

Workbench login 2-way-SSL

In the above login screen, the KeyStore is the SSL Certificate that was created for the Client and the TrustStore will be used to validate/trust the SSL Certificate of the AEM Server. It can be the cacerts from the AEM Workbench for example. If you are using a Self-Signed SSL Certificate, don’t forget to add the Trust Chain into the TrustStore.

Cet article AEM Forms – “2-way-SSL” Setup and Workbench configuration est apparu en premier sur Blog dbi services.

Adding a dbvisit standby database on the ODA in a non-OMF environment

Wed, 2019-10-23 06:04

I have recently been working on a customer project where I had been challenged adding a dbvisit standby database on an ODA X7-2M, named ODA03. The existing customer environment was composed of Oracle Standard 12.2 version database. The primary database, myDB, is running on server named srv02 and using a non-OMF configuration. On the ODA side we are working with OMF configuration. The dbvisit version available at that time was version 8. You need to know that version 9 is currently the last one and brings some new cool features. Through this blog I would like to share with you my experience, the problem I have been facing and the solution I could put in place.

Preparing the instance on the ODA

First of all I have been creating an instance only database on the ODA.

root@ODA03 ~]# odacli list-dbhomes

ID                                       Name                 DB Version                               Home Location                                 Status   
---------------------------------------- -------------------- ---------------------------------------- --------------------------------------------- ----------
ec33e32a-37d1-4d0d-8c40-b358dcf5660c     OraDB12201_home1     12.2.0.1.180717                          /u01/app/oracle/product/12.2.0.1/dbhome_1     Configured

[root@ODA03 ~]# odacli create-database -m -u myDB_03 -dn domain.name -n myDB -r ACFS -io -dh ec33e32a-37d1-4d0d-8c40-b358dcf5660c
Password for SYS,SYSTEM and PDB Admin:

Job details
----------------------------------------------------------------
                     ID:  96fd4d07-4604-4158-9c25-702c01f4493e
            Description:  Database service creation with db name: myDB
                 Status:  Created
                Created:  May 15, 2019 4:29:15 PM CEST
                Message:

Task Name                                Start Time                          End Time                            Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------

[root@ODA03 ~]# odacli describe-job -i 96fd4d07-4604-4158-9c25-702c01f4493e

Job details
----------------------------------------------------------------
                     ID:  96fd4d07-4604-4158-9c25-702c01f4493e
            Description:  Database service creation with db name: myDB
                 Status:  Success
                Created:  May 15, 2019 4:29:15 PM CEST
                Message:

Task Name                                Start Time                          End Time                            Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------
Setting up ssh equivalance               May 15, 2019 4:29:16 PM CEST        May 15, 2019 4:29:16 PM CEST        Success
Creating volume datmyDB                    May 15, 2019 4:29:16 PM CEST        May 15, 2019 4:29:38 PM CEST        Success
Creating volume reco                     May 15, 2019 4:29:38 PM CEST        May 15, 2019 4:30:00 PM CEST        Success
Creating ACFS filesystem for DATA        May 15, 2019 4:30:00 PM CEST        May 15, 2019 4:30:17 PM CEST        Success
Creating ACFS filesystem for RECO        May 15, 2019 4:30:17 PM CEST        May 15, 2019 4:30:35 PM CEST        Success
Database Service creation                May 15, 2019 4:30:35 PM CEST        May 15, 2019 4:30:51 PM CEST        Success
Auxiliary Instance Creation              May 15, 2019 4:30:35 PM CEST        May 15, 2019 4:30:47 PM CEST        Success
password file creation                   May 15, 2019 4:30:47 PM CEST        May 15, 2019 4:30:49 PM CEST        Success
archive and redo log location creation   May 15, 2019 4:30:49 PM CEST        May 15, 2019 4:30:49 PM CEST        Success
updating the Database version            May 15, 2019 4:30:49 PM CEST        May 15, 2019 4:30:51 PM CEST        Success

Next steps are really common DBA operations :

  • Create a pfile from the current primary database
  • Transfer the pfile to the ODA
  • Update the pfile as needed (path, db_unique_name, …)
  • Create a spfile from the pfile on the new ODA database
  • Apply ODA specific instance parameters
  • Copy or create the password file with same password

The parameters that are mandatory to be set on the ODA instance are the following :
*.db_create_file_dest=’/u02/app/oracle/oradata/myDB_03′
*.db_create_online_log_dest_1=’/u03/app/oracle/redo’
*.db_recovery_file_dest=’/u03/app/oracle/fast_recovery_area’

Also all the convert parameters should be removed. Using convert parameter is incompatible with OMF.

Creating the standby database Using dbvisit

I first tried to use dbvisit to create the standby database.

As usual and common dbvisit operation, I first created the DDC configuration file from the primary server :

oracle@srv02:/u01/app/dbvisit/standby/ [myDB] ./dbvctl -o setup
...
...
...
Below are the list of configuration variables provided during the setup process:

Configuration Variable             Value Provided
======================             ==============
ORACLE_SID                         myDB
ORACLE_HOME                        /opt/oracle/product/12.2.0

SOURCE                             srv02
ARCHSOURCE                         /u03/app/oracle/dbvisit_arch/myDB
RAC_DR                             N
USE_SSH                            N
DESTINATION                        ODA03
NETPORT                            7890
DBVISIT_BASE_DR                    /u01/app/dbvisit
ORACLE_HOME_DR                     /u01/app/oracle/product/12.2.0.1/dbhome_1
DB_UNIQUE_NAME_DR                  myDB_03
ARCHDEST                           /u03/app/oracle/dbvisit_arch/myDB
ORACLE_SID_DR                      myDB
ENV_FILE                           myDBSTD1

Are these variables correct?  [Yes]:
...
...
...

I then used this DDC configuration file to create the standby database :

oracle@srv02:/u01/app/dbvisit/standby/ [myDB] ./dbvctl -d myDBSTD1 --csd


-------------------------------------------------------------------------------

INIT ORA PARAMETERS
-------------------------------------------------------------------------------
*              audit_file_dest                         /u01/app/oracle/admin/myDB/adump
*              compatible                              12.2.0
*              control_management_pack_access          NONE
*              db_block_size                           8192
*              db_create_file_dest                     /u02/app/oracle/oradata/myDB_03
*              db_create_online_log_dest_1             /u03/app/oracle/redo
*              db_domain
*              db_name                                 myDB
*              db_recovery_file_dest                   /u03/app/oracle/fast_recovery_area
*              db_recovery_file_dest_size              240G
*              db_unique_name                          myDB_03
*              diagnostic_dest                         /u01/app/oracle
*              dispatchers                             (PROTOCOL=TCP) (SERVICE=myDBXDB)
*              instance_mode                           READ-WRITE
*              java_pool_size                          268435456
*              log_archive_dest_1                      LOCATION=USE_DB_RECOVERY_FILE_DEST
*              open_cursors                            3000
*              optimizer_features_enable               12.2.0.1
*              pga_aggregate_target                    4194304000
*              processes                               8000
*              remote_login_passwordfile               EXCLUSIVE
*              resource_limit                          TRUE
*              sessions                                7552
*              sga_max_size                            53687091200
*              sga_target                              26843545600
*              shared_pool_reserved_size               117440512
*              spfile                                  OS default
*              statistics_level                        TYPICAL
*              undo_retention                          300
*              undo_tablespace                         UNDOTBS1

-------------------------------------------------------------------------------

Status: VALID

What would you like to do:
   1 - Create standby database using existing saved template
   2 - View content of existing saved template
   3 - Return to the previous menu
   Please enter your choice [1]:

This operation failed with following errors :

Cannot create standby data or temp file /usr/oracle/oradata/myDB/myDB_bi_temp01.dbf for
primary file /usr/oracle/oradata/myDB/myDB_bi_temp01.dbf as location /usr/oracle/oradata/myDB
does not exist on the standby.

A per dbvisit documentation, dbvisit standby is certified ODA and fully compatible with non-OMF and OMF databases. This is correct, the only distinction is that the full environment needs to be in same configuration. That’s to say that if the primary is OMF, the standby is expected to be OMF. If the primary is running a non-OMF configuration, the standby should be using non-OMF as well.

Using RMAN

I decided to duplicate the database using RMAN and a backup that I transferred locally on the ODA. The backup was the previous nightly inc0 backup. Before running the rman duplication I executed a last archive log backup to make sure to have the most recent archive used in the duplication.

I’m taking this opportunity to highlight that, thanks to ODA NVMe technology, the duplication of the 3 TB database without multiple channel (standard edition) took a bit more than 2 hours only. On the existing servers this took about 10 hours.

I added following tns entry in the tnsnames.ora.

myDBSRV3 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = ODA03.domain.name)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = myDB)
      (UR = A)
    )
  )

Of course I could have been using a local connection.

I made sure the database to be in nomount status and ran the rman duplication :

oracle@ODA03:/opt/oracle/backup/ [myDB] rmanh

Recovery Manager: Release 12.2.0.1.0 - Production on Mon May 20 13:24:29 2019

Copyright (c) 1982, 2017, Oracle and/or its affiliates.  All rights reserved.

RMAN> connect auxiliary sys@myDBSRV3

auxiliary database Password:
connected to auxiliary database: myDB (not mounted)

RMAN> run {
2> duplicate target database for standby dorecover backup location '/opt/oracle/backup/myDB';
3> }

Starting Duplicate Db at 20-MAY-2019 13:25:51

contents of Memory Script:
{
   sql clone "alter system set  control_files =
  ''/u03/app/oracle/redo/myDB_03/controlfile/o1_mf_gg4qvpnn_.ctl'' comment=
 ''Set by RMAN'' scope=spfile";
   restore clone standby controlfile from  '/opt/oracle/backup/myDB/ctl_myDB_myDB_s108013_p1_newbak.ctl';
}
executing Memory Script

sql statement: alter system set  control_files =   ''/u03/app/oracle/redo/myDB_03/controlfile/o1_mf_gg4qvpnn_.ctl'' comment= ''Set by RMAN'' scope=spfile

Starting restore at 20-MAY-2019 13:25:51
allocated channel: ORA_AUX_DISK_1
channel ORA_AUX_DISK_1: SID=9186 device type=DISK

channel ORA_AUX_DISK_1: restoring control file
channel ORA_AUX_DISK_1: restore complete, elapsed time: 00:00:01
output file name=/u03/app/oracle/redo/myDB_03/controlfile/o1_mf_gg4qvpnn_.ctl
Finished restore at 20-MAY-2019 13:25:52

contents of Memory Script:
{
   sql clone 'alter database mount standby database';
}
executing Memory Script

sql statement: alter database mount standby database
released channel: ORA_AUX_DISK_1
allocated channel: ORA_AUX_DISK_1
channel ORA_AUX_DISK_1: SID=9186 device type=DISK

contents of Memory Script:
{
   set until scn  49713361973;
   set newname for clone tempfile  1 to new;
   set newname for clone tempfile  2 to new;
   switch clone tempfile all;
   set newname for clone datafile  1 to new;
   set newname for clone datafile  2 to new;
   set newname for clone datafile  3 to new;
   set newname for clone datafile  4 to new;
   set newname for clone datafile  5 to new;
   set newname for clone datafile  6 to new;
   set newname for clone datafile  7 to new;
   set newname for clone datafile  8 to new;
   set newname for clone datafile  10 to new;
   set newname for clone datafile  11 to new;
   set newname for clone datafile  12 to new;
   set newname for clone datafile  13 to new;
   set newname for clone datafile  14 to new;
   set newname for clone datafile  15 to new;
   set newname for clone datafile  16 to new;
   set newname for clone datafile  17 to new;
   set newname for clone datafile  18 to new;
   restore
   clone database
   ;
}
executing Memory Script

executing command: SET until clause

executing command: SET NEWNAME

executing command: SET NEWNAME

renamed tempfile 1 to /u02/app/oracle/oradata/myDB_03/myDB_03/datafile/o1_mf_temp_%u_.tmp in control file
renamed tempfile 2 to /u02/app/oracle/oradata/myDB_03/myDB_03/datafile/o1_mf_lx_bi_te_%u_.tmp in control file

executing command: SET NEWNAME

...
...
...

executing command: SET NEWNAME

Starting restore at 20-MAY-2019 13:25:57
using channel ORA_AUX_DISK_1

channel ORA_AUX_DISK_1: starting datafile backup set restore
channel ORA_AUX_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_AUX_DISK_1: restoring datafile 00001 to /u02/app/oracle/oradata/myDB_03/myDB_03/datafile/o1_mf_system_%u_.dbf
channel ORA_AUX_DISK_1: restoring datafile 00003 to /u02/app/oracle/oradata/myDB_03/myDB_03/datafile/o1_mf_undotbs1_%u_.dbf
channel ORA_AUX_DISK_1: restoring datafile 00005 to /u02/app/oracle/oradata/myDB_03/myDB_03/datafile/o1_mf_lxdataid_%u_.dbf
channel ORA_AUX_DISK_1: restoring datafile 00006 to /u02/app/oracle/oradata/myDB_03/myDB_03/datafile/o1_mf_renderz2_%u_.dbf
channel ORA_AUX_DISK_1: restoring datafile 00007 to /u02/app/oracle/oradata/myDB_03/myDB_03/datafile/o1_mf_lx_ods_%u_.dbf
channel ORA_AUX_DISK_1: restoring datafile 00008 to /u02/app/oracle/oradata/myDB_03/myDB_03/datafile/o1_mf_users_%u_.dbf
channel ORA_AUX_DISK_1: restoring datafile 00013 to /u02/app/oracle/oradata/myDB_03/myDB_03/datafile/o1_mf_renderzs_%u_.dbf
channel ORA_AUX_DISK_1: restoring datafile 00015 to /u02/app/oracle/oradata/myDB_03/myDB_03/datafile/o1_mf_lx_stagi_%u_.dbf
channel ORA_AUX_DISK_1: reading from backup piece /opt/oracle/backup/myDB/inc0_myDB_s107963_p1
...
...
...
archived log file name=/opt/oracle/backup/myDB/1_58043_987102791.dbf thread=1 sequence=58043
archived log file name=/opt/oracle/backup/myDB/1_58044_987102791.dbf thread=1 sequence=58044
archived log file name=/opt/oracle/backup/myDB/1_58045_987102791.dbf thread=1 sequence=58045
archived log file name=/opt/oracle/backup/myDB/1_58046_987102791.dbf thread=1 sequence=58046
archived log file name=/opt/oracle/backup/myDB/1_58047_987102791.dbf thread=1 sequence=58047
archived log file name=/opt/oracle/backup/myDB/1_58048_987102791.dbf thread=1 sequence=58048
archived log file name=/opt/oracle/backup/myDB/1_58049_987102791.dbf thread=1 sequence=58049
archived log file name=/opt/oracle/backup/myDB/1_58050_987102791.dbf thread=1 sequence=58050
media recovery complete, elapsed time: 00:12:40
Finished recover at 20-MAY-2019 16:06:22
Finished Duplicate Db at 20-MAY-2019 16:06:39

I could check and see that my standby database has been successfully created on the ODA :

oracle@ODA03:/u01/app/oracle/local/dmk/etc/ [myDB] myDB
********* dbi services Ltd. *********
STATUS                 : MOUNTED
DB_UNIQUE_NAME         : myDB_03
OPEN_MODE              : MOUNTED
LOG_MODE               : ARCHIVELOG
DATABASE_ROLE          : PHYSICAL STANDBY
FLASHBACK_ON           : NO
FORCE_LOGGING          : YES
CDB Enabled            : NO
*************************************

As a personal note, I really found using oracle RMAN more convenient to duplicate a database. Albeit dbvisit script and tool is really stable, I think that this will give you more flexibility.

Registering the database in the grid cluster

As next step I registered the database in the grid.

oracle@ODA03:/u01/app/oracle/product/12.2.0.1/dbhome_1/dbs/ [LX] srvctl add database -db MyDB_03 -oraclehome /u01/app/oracle/product/12.2.0.1/dbhome_1 -dbtype SINGLE -instance MyDB -domain team-w.local -spfile /u02/app/oracle/oradata/MyDB_03/dbs/spfileMyDB.ora -pwfile /u01/app/oracle/product/12.2.0.1/dbhome_1/dbs/orapwMyDB -role PHYSICAL_STANDBY -startoption MOUNT -stopoption IMMEDIATE -dbname MyDB -node ODA03 -acfspath "/u02/app/oracle/oradata/MyDB_03,/u03/app/oracle"

I stopped the database :

SQL> shutdown immediate;
ORA-01109: database not open


Database dismounted.
ORACLE instance shut down.

And started it again with the grid infrastructure :

oracle@ODA03:/u01/app/oracle/product/12.2.0.1/dbhome_1/dbs/ [MyDB] MyDB
********* dbi services Ltd. *********
STATUS          : STOPPED
*************************************

oracle@ODA03:/u01/app/oracle/product/12.2.0.1/dbhome_1/dbs/ [MyDB] srvctl status database -d MyDB_03
Instance MyDB is not running on node ODA03

oracle@ODA03:/u01/app/oracle/product/12.2.0.1/dbhome_1/dbs/ [MyDB] srvctl start database -d MyDB_03

oracle@ODA03:/u01/app/oracle/product/12.2.0.1/dbhome_1/dbs/ [MyDB] srvctl status database -d MyDB_03
Instance MyDB is running on node ODA03
dbvisit synchronization

We now have our standby database created on the ODA. We just need to synchronize it with the primary.

Run a gap report

Executing a gap report, we can see that the newly created database is running almost 4 hours behind.

oracle@srv02:/u01/app/dbvisit/standby/ [rdbms12201] ./dbvctl -d myDBSTD1 -i
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 321953)
dbvctl started on srv02: Mon May 20 16:24:35 2019
=============================================================


Dbvisit Standby log gap report for myDB thread 1 at 201905201624:
-------------------------------------------------------------
Destination database on ODA03 is at sequence: 58050.
Source database on srv02 is at log sequence: 58080.
Source database on srv02 is at archived log sequence: 58079.
Dbvisit Standby last transfer log sequence: .
Dbvisit Standby last transfer at: .

Archive log gap for thread 1:  29.
Transfer log gap for thread 1: 58079.
Standby database time lag (DAYS-HH:MI:SS): +03:39:01.


=============================================================
dbvctl ended on srv02: Mon May 20 16:24:40 2019
=============================================================
Send the archive logs from primary to the standby database

I have been shipping the last archive logs from the primary database to the newly created standby.

oracle@srv02:/u01/app/dbvisit/standby/ [rdbms12201] ./dbvctl -d myDBSTD1
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 326409)
dbvctl started on srv02: Mon May 20 16:29:14 2019
=============================================================

>>> Obtaining information from standby database (RUN_INSPECT=Y)... done
    Thread: 1 Archive log gap: 30. Transfer log gap: 58080
>>> Sending heartbeat message... skipped
>>> First time Dbvisit Standby runs, Dbvisit Standby configuration will be copied to
    ODA03...
>>> Transferring Log file(s) from myDB on srv02 to ODA03 for thread 1:

    thread 1 sequence 58051 (1_58051_987102791.dbf)
    thread 1 sequence 58052 (1_58052_987102791.dbf)
...
...
...
    thread 1 sequence 58079 (1_58079_987102791.dbf)
    thread 1 sequence 58080 (1_58080_987102791.dbf)

=============================================================
dbvctl ended on srv02: Mon May 20 16:30:50 2019
=============================================================
Apply archive logs on the standby database

Then I could finally apply the archive logs on the standby database.

oracle@ODA03:/u01/app/dbvisit/standby/ [myDB] ./dbvctl -d myDBSTD1
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 21504)
dbvctl started on ODA03: Mon May 20 16:33:42 2019
=============================================================

>>> Sending heartbeat message... skipped

>>> Applying Log file(s) from srv02 to myDB on ODA03:

    thread 1 sequence 58051 (1_58051_987102791.arc)
    thread 1 sequence 58052 (1_58052_987102791.arc)
...
...
...
    thread 1 sequence 58079 (1_58079_987102791.arc)
    thread 1 sequence 58080 (1_58080_987102791.arc)
    Last applied log(s):
    thread 1 sequence 58080

    Next SCN required for recovery 49719323442 generated at 2019-05-20:16:27:09 +02:00.
    Next required log thread 1 sequence 58081

=============================================================
dbvctl ended on ODA03: Mon May 20 16:36:52 2019
=============================================================
Run a gap report

Running a new gap report, we can see that there is no delta between the primary and the standby database.

oracle@srv02:/u01/app/dbvisit/standby/ [rdbms12201] ./dbvctl -d myDBSTD1 -i
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 335068)
dbvctl started on srv02: Mon May 20 16:37:53 2019
=============================================================


Dbvisit Standby log gap report for myDB thread 1 at 201905201637:
-------------------------------------------------------------
Destination database on ODA03 is at sequence: 58081.
Source database on srv02 is at log sequence: 58082.
Source database on srv02 is at archived log sequence: 58081.
Dbvisit Standby last transfer log sequence: 58081.
Dbvisit Standby last transfer at: 2019-05-20 16:37:36.

Archive log gap for thread 1:  0.
Transfer log gap for thread 1: 0.
Standby database time lag (DAYS-HH:MI:SS): +00:00:01.


=============================================================
dbvctl ended on srv02: Mon May 20 16:37:57 2019
=============================================================
Preparing the database for switchover

Are we done? Absolutely not. In order to be able to successfully perform a switchover, 3 main modifications are mandatory on the non-ODA Server (running non-OMF database) :

  • The future database files should be OMF
  • The online redo log should be newly created
  • The temporary file should be newly created

Otherwise you might end with unsuccessfull switchover having below errors :

>>> Starting Switchover between srv02 and ODA03

Running pre-checks       ... failed
No rollback action required

>>> Database on server srv02 is still a Primary Database
>>> Database on server ODA03 is still a Standby Database


<<<>>>
PID:40386
TRACEFILE:40386_dbvctl_switchover_myDBSTD1_201905272153.trc
SERVER:srv02
ERROR_CODE:1
Remote execution error on ODA03.

====================Remote Output start: ODA03=====================
<<<>>>
PID:92292
TRACEFILE:92292_dbvctl_f_gs_get_info_standby_myDBSTD1_201905272153.trc
SERVER:ODA03
ERROR_CODE:2146
Dbvisit Standby cannot proceed:
Cannot create standby data or temp file /usr/oracle/oradata/myDB/temp01.dbf for primary
file /usr/oracle/oradata/myDB/temp01.dbf as location /usr/oracle/oradata/myDB does not
exist on the standby.
Cannot create standby data or temp file /usr/oracle/oradata/myDB/lx_bi_temp01.dbf for
primary file /usr/oracle/oradata/myDB/lx_bi_temp01.dbf as location /usr/oracle/oradata/myDB
does not exist on the standby.
Review the following standby database parameters:
        db_create_file_dest = /u02/app/oracle/oradata/myDB_03
        db_file_name_convert =
>>>> Dbvisit Standby terminated <<<>>> Dbvisit Standby terminated <<<<
Having new OMF configuration

There is no need to convert the full database into OMF. A database can run having both file naming configuration, non-OMF and OMF. We just need to have the database working now with OMF configuration. For this we will just apply the appropriate value to the init parameter. In my case the existing primary database was storing all data and redo files in the /opt/oracle/oradata directory.

SQL> alter system set DB_CREATE_FILE_DEST='/opt/oracle/oradata' scope=both;

System wurde geändert.

SQL> alter system set DB_CREATE_ONLINE_LOG_DEST_1='/opt/oracle/oradata' scope=both;

System wurde geändert.
Refresh the online log

We will create new OMF redo log files as described below.

The current redo log configuration :

SQL> select v$log.group#, member, v$log.status from v$logfile, v$log where v$logfile.group#=v$log.group#;

    GROUP# MEMBER                                             STATUS
---------- -------------------------------------------------- ----------
        12 /opt/oracle/oradata/myDB/redo12.log                  ACTIVE
        13 /opt/oracle/oradata/myDB/redo13.log                  CURRENT
        15 /opt/oracle/oradata/myDB/redo15.log                  INACTIVE
        16 /opt/oracle/oradata/myDB/redo16.log                  INACTIVE
         1 /opt/oracle/oradata/myDB/redo1.log                   INACTIVE
         2 /opt/oracle/oradata/myDB/redo2.log                   INACTIVE
        17 /opt/oracle/oradata/myDB/redo17.log                  INACTIVE
        18 /opt/oracle/oradata/myDB/redo18.log                  INACTIVE
        19 /opt/oracle/oradata/myDB/redo19.log                  INACTIVE
        20 /opt/oracle/oradata/myDB/redo20.log                  INACTIVE
         3 /opt/oracle/oradata/myDB/redo3.log                   INACTIVE
         4 /opt/oracle/oradata/myDB/redo4.log                   INACTIVE
         5 /opt/oracle/oradata/myDB/redo5.log                   INACTIVE
         6 /opt/oracle/oradata/myDB/redo6.log                   INACTIVE
         7 /opt/oracle/oradata/myDB/redo7.log                   INACTIVE
         8 /opt/oracle/oradata/myDB/redo8.log                   ACTIVE
         9 /opt/oracle/oradata/myDB/redo9.log                   ACTIVE
        10 /opt/oracle/oradata/myDB/redo10.log                  ACTIVE
        11 /opt/oracle/oradata/myDB/redo11.log                  ACTIVE
        14 /opt/oracle/oradata/myDB/redo14.log                  INACTIVE

For all INACTIVE redo log groups, we will be able to drop the group and create it again with the OMF naming convention :

SQL> alter database drop logfile group 1;

Datenbank wurde geändert.

SQL> alter database add logfile group 1;

Datenbank wurde geändert.

In order to move to the next redo group and release the current one, we will run a switch log file :

SQL> alter system switch logfile;

System wurde geändert.

To move the ACTIVE redo log to INACTIVE we will run a checkpoint :

SQL> alter system checkpoint;

System wurde geändert.

And then drop and recreate the last INACTIVE redo groups :

SQL> alter database drop logfile group 10;

Datenbank wurde geändert.

SQL> alter database add logfile group 10;

Datenbank wurde geändert.

To finally have all our online log with OMF format :

SQL> select v$log.group#, member, v$log.status from v$logfile, v$log where v$logfile.group#=v$log.group# order by group#;

    GROUP# MEMBER                                                       STATUS
---------- ------------------------------------------------------------ ----------
         1 /opt/oracle/oradata/myDB/onlinelog/o1_mf_1_ggqx5zon_.log       INACTIVE
         2 /opt/oracle/oradata/myDB/onlinelog/o1_mf_2_ggqxjky2_.log       INACTIVE
         3 /opt/oracle/oradata/myDB/onlinelog/o1_mf_3_ggqxjodl_.log       INACTIVE
         4 /opt/oracle/oradata/myDB/onlinelog/o1_mf_4_ggqxkddc_.log       INACTIVE
         5 /opt/oracle/oradata/myDB/onlinelog/o1_mf_5_ggqxkj1t_.log       INACTIVE
         6 /opt/oracle/oradata/myDB/onlinelog/o1_mf_6_ggqxkmnm_.log       CURRENT
         7 /opt/oracle/oradata/myDB/onlinelog/o1_mf_7_ggqxn373_.log       UNUSED
         8 /opt/oracle/oradata/myDB/onlinelog/o1_mf_8_ggqxn7b3_.log       UNUSED
         9 /opt/oracle/oradata/myDB/onlinelog/o1_mf_9_ggqxnbxd_.log       UNUSED
        10 /opt/oracle/oradata/myDB/onlinelog/o1_mf_10_ggqxvlbf_.log      UNUSED
        11 /opt/oracle/oradata/myDB/onlinelog/o1_mf_11_ggqxvnyg_.log      UNUSED
        12 /opt/oracle/oradata/myDB/onlinelog/o1_mf_12_ggqxvqyp_.log      UNUSED
        13 /opt/oracle/oradata/myDB/onlinelog/o1_mf_13_ggqxvv2o_.log      UNUSED
        14 /opt/oracle/oradata/myDB/onlinelog/o1_mf_14_ggqxxcq7_.log      UNUSED
        15 /opt/oracle/oradata/myDB/onlinelog/o1_mf_15_ggqxxgfg_.log      UNUSED
        16 /opt/oracle/oradata/myDB/onlinelog/o1_mf_16_ggqxxk67_.log      UNUSED
        17 /opt/oracle/oradata/myDB/onlinelog/o1_mf_17_ggqxypwg_.log      UNUSED
        18 /opt/oracle/oradata/myDB/onlinelog/o1_mf_18_ggqy1z78_.log      UNUSED
        19 /opt/oracle/oradata/myDB/onlinelog/o1_mf_19_ggqy2270_.log      UNUSED
        20 /opt/oracle/oradata/myDB/onlinelog/o1_mf_20_ggqy26bj_.log      UNUSED

20 Zeilen ausgewählt.
Refresh temporary file

The database was using 2 temp tablespaces : TEMP and MyDB_BI_TEMP.

We first need to add new temp files in OMF format for both tablespaces.

SQL> alter tablespace TEMP add tempfile size 20G;

Tablespace wurde geändert.

SQL> alter tablespace myDB_BI_TEMP add tempfile size 20G;

Tablespace wurde geändert.

Both tablespace will now include 2 files : a previous non-OMF one and a new OMF one :

SQL> @qdbstbsinf.sql
Enter a tablespace name filter (US%): TEMP

TABLESPACE_NAME      FILE_NAME                                                    STATUS             SIZE_MB AUTOEXTENSIB MAXSIZE_MB
-------------------- ------------------------------------------------------------ --------------- ---------- ------------ ----------
TEMP                 /opt/oracle/oradata/myDB/datafile/o1_mf_temp_ggrjzm9o_.tmp     ONLINE               20480 NO                    0
TEMP                 /usr/oracle/oradata/myDB/temp01.dbf                            ONLINE               20480 NO                    0

SQL> @qdbstbsinf.sql
Enter a tablespace name filter (US%): myDB_BI_TEMP

TABLESPACE_NAME      FILE_NAME                                                    STATUS             SIZE_MB AUTOEXTENSIB MAXSIZE_MB
-------------------- ------------------------------------------------------------ --------------- ---------- ------------ ----------
myDB_BI_TEMP           /opt/oracle/oradata/myDB/datafile/o1_mf_lx_bi_te_ggrk0wxz_.tmp ONLINE               20480 NO                    0
myDB_BI_TEMP           /usr/oracle/oradata/myDB/lx_bi_temp01.dbf                      ONLINE               20480 YES                5120

Dropping temporary file will end into error :

SQL> alter database tempfile '/usr/oracle/oradata/myDB/temp01.dbf' drop including datafiles;
alter database tempfile '/usr/oracle/oradata/myDB/temp01.dbf' drop including datafiles
*
FEHLER in Zeile 1:
ORA-25152: TEMPFILE kann momentan nicht gelöscht werden

We need to restart the database. This will only be possible during the maintenance windows scheduled to run the switchover.

SQL> shutdown immediate;
Datenbank geschlossen.
Datenbank dismounted.
ORACLE-Instanz heruntergefahren.

SQL> startup
ORACLE-Instanz hochgefahren.

Total System Global Area 5,3687E+10 bytes
Fixed Size                 26330584 bytes
Variable Size            3,3152E+10 bytes
Database Buffers         2,0401E+10 bytes
Redo Buffers              107884544 bytes
Datenbank mounted.
Datenbank geöffnet.

The previous non-OMF temporary file can now be deleted :

SQL>  alter database tempfile '/usr/oracle/oradata/myDB/temp01.dbf' drop including datafiles;

Datenbank wurde geändert.

SQL> alter database tempfile '/usr/oracle/oradata/myDB/lx_bi_temp01.dbf' drop including datafiles;

Datenbank wurde geändert.

And we only have OMF temporary files now :

SQL>  @qdbstbsinf.sql
Enter a tablespace name filter (US%): TEMP

TABLESPACE_NAME      FILE_NAME                                                    STATUS             SIZE_MB AUTOEXTENSIB MAXSIZE_MB
-------------------- ------------------------------------------------------------ --------------- ---------- ------------ ----------
TEMP                 /opt/oracle/oradata/myDB/datafile/o1_mf_temp_ggrjzm9o_.tmp     ONLINE               20480 NO                    0

SQL>  @qdbstbsinf.sql
Enter a tablespace name filter (US%): myDB_BI_TEMP

TABLESPACE_NAME      FILE_NAME                                                    STATUS             SIZE_MB AUTOEXTENSIB MAXSIZE_MB
-------------------- ------------------------------------------------------------ --------------- ---------- ------------ ----------
myDB_BI_TEMP           /opt/oracle/oradata/myDB/datafile/o1_mf_lx_bi_te_ggrk0wxz_.tmp ONLINE               20480 NO                    0

Testing switchover

We are now ready to test the switchover from current srv02 primary to ODA03 server after making sure both databases are synchronized.

oracle@srv02:/u01/app/dbvisit/standby/ [MyDB] ./dbvctl -d MyDBSTD1 -o switchover
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 12196)
dbvctl started on srv02: Tue May 28 00:07:34 2019
=============================================================

>>> Starting Switchover between srv02 and ODA03

Running pre-checks       ... done
Pre processing           ... done
Processing primary       ... done
Processing standby       ... done
Converting standby       ... done
Converting primary       ... done
Completing               ... done
Synchronizing            ... done
Post processing          ... done

>>> Graceful switchover completed.
    Primary Database Server: ODA03
    Standby Database Server: srv02

>>> Dbvisit Standby can be run as per normal:
    dbvctl -d MyDBSTD1


PID:12196
TRACE:12196_dbvctl_switchover_MyDBSTD1_201905280007.trc

=============================================================
dbvctl ended on srv02: Tue May 28 00:13:31 2019
=============================================================
Conclusion

With dbvisit standby it is possible to mix non-OMF and OMF databases after completing several manual steps. The final recommendation would be to run a unique configuration. This is why, after having run a switchover to the new ODA03 database, and making sure the new database is stable, we created from scratch the old primary srv02 database with OMF configuration. Converting a database to OMF using move option is not possible with standard edition.

Cet article Adding a dbvisit standby database on the ODA in a non-OMF environment est apparu en premier sur Blog dbi services.

Having multiple standby databases and cascading with dbvisit

Wed, 2019-10-23 05:47

Dbvisit standy is a disaster recovery solution that you will be able to use with Oracle standard edition. I have been working on a customer project where I had to setup a system having one primary and two standby databases. One of the standby database had to run with a gap of 24 hours. Knowing that flashback possibilities are very limited on standard edition, this would give customer the ability to extract and restore some data been wrongly lost following human errors.

The initial configuration would be the following one :

Database instance, db_name : MyDB
MyDB_02 (db_unique_name) primary database running on srv02 server.
MyDB_01 (db_unique_name) expected standby database running on srv01 server.
MyDB_03 (db_unique_name) expected standby database running on srv03 server.

The following DDC configuration file will be used :
MyDBSTD1 : Configuration file for first standby been synchronized every 10 minutes.
MyDBSTD2 : Configuration file for second standby been synchronized every 24 hours.

Let me walk you through the steps to setup such configuration. This article is not intended to show the whole process of implementing a dbvisit solution, but only the steps required to work with multiple standby. We will also talk about how we can implement cascaded standby and apply lag delay within dbvisit.

Recommendations

In order to limit the manual configuration changes in the DDC file after a switchover, it is recommended to use as much as possible same ORACLE_HOME, ARCHIVE Destination and DBVISIT home directory.

Creating MyDBSTD1 DDC configuration file

The first standby configuration file will be created and used between MyDB_03 (srv03) and MyDB_02 (srv02).

oracle@srv02:/u01/app/dbvisit/standby/ [MyDB] ./dbvctl -o setup


=========================================================

     Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd)
           http://www.dbvisit.com

=========================================================

=>dbvctl only needs to be run on the primary server.

Is this the primary server?  [Yes]:
The following Dbvisit Database configuration (DDC) file(s) found on this
server:

     DDC
     ===
1)   Create New DDC
2)   Cancel

Please enter choice [] : 1

Is this correct?  [Yes]:

...
...
...

Below are the list of configuration variables provided during the setup process:

Configuration Variable             Value Provided
======================             ==============
ORACLE_SID                         MyDB
ORACLE_HOME                        /opt/oracle/product/12.2.0

SOURCE                             srv02
ARCHSOURCE                         /u03/app/oracle/dbvisit_arch/MyDB
RAC_DR                             N
USE_SSH                            N
DESTINATION                        srv03
NETPORT                            7890
DBVISIT_BASE_DR                    /u01/app/dbvisit
ORACLE_HOME_DR                     /u01/app/oracle/product/12.2.0.1/dbhome_1
DB_UNIQUE_NAME_DR                  MyDB_03
ARCHDEST                           /u03/app/oracle/dbvisit_arch/MyDB
ORACLE_SID_DR                      MyDB
ENV_FILE                           MyDBSTD1

Are these variables correct?  [Yes]:

>>> Dbvisit Database configuration (DDC) file MyDBSTD1 created.

>>> Dbvisit Database repository (DDR) MyDB created.
   Repository Version          8.4
   Software Version            8.4
   Repository Status           VALID


Do you want to enter license key for the newly created Dbvisit Database configuration (DDC) file?  [Yes]:

Enter license key and press Enter: []: XXXXXXXXXXXXXXXXXXXXXXXXXXX
>>> Dbvisit Standby License
License Key     : XXXXXXXXXXXXXXXXXXXXXXXXXXX
customer_number : XXXXXX
dbname          : MyDB
expiry_date     : 2099-05-06
product_id      : 8
sequence        : 1
status          : VALID
updated         : YES

PID:423545
TRACE:dbvisit_install.log
Synchronizing both MyDB_02 and MyDB_03 Shippping logs from primary to standby
oracle@srv02:/u01/app/dbvisit/standby/ [rdbms12201] ./dbvctl -d MyDBSTD1
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 326409)
dbvctl started on srv02: Mon May 20 16:29:14 2019
=============================================================

>>> Obtaining information from standby database (RUN_INSPECT=Y)... done
    Thread: 1 Archive log gap: 30. Transfer log gap: 58080
>>> Sending heartbeat message... skipped
>>> First time Dbvisit Standby runs, Dbvisit Standby configuration will be copied to
    srv03...
>>> Transferring Log file(s) from MyDB on srv02 to srv03 for thread 1:

    thread 1 sequence 58051 (1_58051_987102791.dbf)
    thread 1 sequence 58052 (1_58052_987102791.dbf)
...
...
...
    thread 1 sequence 58079 (1_58079_987102791.dbf)
    thread 1 sequence 58080 (1_58080_987102791.dbf)

=============================================================
dbvctl ended on srv02: Mon May 20 16:30:50 2019
=============================================================
Applying log on standby database
oracle@srv03:/u01/app/dbvisit/standby/ [MyDB] ./dbvctl -d MyDBSTD1
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 21504)
dbvctl started on srv03: Mon May 20 16:33:42 2019
=============================================================

>>> Sending heartbeat message... skipped

>>> Applying Log file(s) from srv02 to MyDB on srv03:

    thread 1 sequence 58051 (1_58051_987102791.arc)
    thread 1 sequence 58052 (1_58052_987102791.arc)
...
...
...
    thread 1 sequence 58079 (1_58079_987102791.arc)
    thread 1 sequence 58080 (1_58080_987102791.arc)
    Last applied log(s):
    thread 1 sequence 58080

    Next SCN required for recovery 49719323442 generated at 2019-05-20:16:27:09 +02:00.
    Next required log thread 1 sequence 58081

=============================================================
dbvctl ended on srv03: Mon May 20 16:36:52 2019
=============================================================
Running a gap report
oracle@srv02:/u01/app/dbvisit/standby/ [MyDB] ./dbvctl -d MyDBSTD1 -i
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 335068)
dbvctl started on srv02: Mon May 20 16:37:53 2019
=============================================================


Dbvisit Standby log gap report for MyDB thread 1 at 201905201637:
-------------------------------------------------------------
Destination database on srv03 is at sequence: 58081.
Source database on srv02 is at log sequence: 58082.
Source database on srv02 is at archived log sequence: 58081.
Dbvisit Standby last transfer log sequence: 58081.
Dbvisit Standby last transfer at: 2019-05-20 16:37:36.

Archive log gap for thread 1:  0.
Transfer log gap for thread 1: 0.
Standby database time lag (DAYS-HH:MI:SS): +00:00:01.


=============================================================
dbvctl ended on srv02: Mon May 20 16:37:57 2019
=============================================================
Switchover to srv03

At that time in the project we did a switchover to the newly created srv03 in order to test its stability. The switchover has been performed as described below, but this step is not mandatory when implementing several standby databases. As best practices, we will always test the first configuration by running a switchover before moving forward.

oracle@srv02:/u01/app/dbvisit/standby/ [MyDB] ./dbvctl -d MyDBSTD1 -o switchover
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 12196)
dbvctl started on srv02: Tue May 28 00:07:34 2019
=============================================================

>>> Starting Switchover between srv02 and srv03

Running pre-checks       ... done
Pre processing           ... done
Processing primary       ... done
Processing standby       ... done
Converting standby       ... done
Converting primary       ... done
Completing               ... done
Synchronizing            ... done
Post processing          ... done

>>> Graceful switchover completed.
    Primary Database Server: srv03
    Standby Database Server: srv02

>>> Dbvisit Standby can be run as per normal:
    dbvctl -d MyDBSTD1


PID:12196
TRACE:12196_dbvctl_switchover_MyDBSTD1_201905280007.trc

=============================================================
dbvctl ended on srv02: Tue May 28 00:13:31 2019
=============================================================

srv03 is now the new primary and srv02 a new standby database.

Creating MyDBSTD2 DDC configuration file

Once myDB_01 standby database is up and running, we can create its related DDC configuration file. To do so, we simply copy previous DDC configuration file, MyDBSTD1, and update it as needed.

I first transferred the file from current primary srv03 to new standby server srv01 :

oracle@srv03:/u01/app/dbvisit/standby/conf/ [MyDB] scp dbv_MyDBSTD1.env oracle@srv01:$PWD
dbv_MyDBSTD1.env		100% 	23KB 	22.7KB/s 		00:00

I copied it into the new DDC configuration file name :

oracle@srv01:/u01/app/dbvisit/standby/conf/ [MyDB] cp dbv_MyDBSTD1.env dbv_MyDBSTD2.env

I updated new DDC configuration accordingly to have :

  • DESTINATION as srv01 instead of srv02
  • DB_UNIQUE_NAME_DR as MyDB_01 instead of MyDB_02
  • MAILCFG to see the alerts coming from STD2 configuration.

The primary will remain the same : srv03.

oracle@srv01:/u01/app/dbvisit/standby/conf/ [MyDB] vi dbv_MyDBSTD2.env

oracle@srv01:/u01/app/dbvisit/standby/conf/ [MyDB] diff dbv_MyDBSTD1.env dbv_MyDBSTD2.env
86c86
DESTINATION = srv02
---
DESTINATION = srv01
93c93
DB_UNIQUE_NAME_DR = MyDB
---
DB_UNIQUE_NAME_DR = MyDB_01
135,136c135,136
MAILCFG_FROM = dbvisit_conf_1@domain.name MAILCFG_FROM_DR = dbvisit_conf_1@domain.name
---
MAILCFG_FROM = dbvisit_conf_2@domain.name
MAILCFG_FROM_DR = dbvisit_conf_2@domain.name

In case the ORACLE_HOME and ARCHIVE destination are not the same, these parameters will have to be updated as well.

Synchronizing both MyDB_03 and MyDB_01 Shippping logs from primary to standby
oracle@srv03:/u01/app/dbvisit/standby/ [MyDB] ./dbvctl -d MyDBSTD2
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 25914)
dbvctl started on srv03: Wed Jun  5 20:32:09 2019
=============================================================

>>> Obtaining information from standby database (RUN_INSPECT=Y)... done
    Thread: 1 Archive log gap: 383. Transfer log gap: 67385
>>> Sending heartbeat message... done
>>> First time Dbvisit Standby runs, Dbvisit Standby configuration will be copied to
    srv01...
>>> Transferring Log file(s) from MyDB on srv03 to srv01 for thread 1:

    thread 1 sequence 67003 (o1_mf_1_67003_ghgwj0z2_.arc)
    thread 1 sequence 67004 (o1_mf_1_67004_ghgwmj1w_.arc)
...
...
...
    thread 1 sequence 67384 (o1_mf_1_67384_ghj2fbgj_.arc)
    thread 1 sequence 67385 (o1_mf_1_67385_ghj2g883_.arc)

=============================================================
dbvctl ended on srv03: Wed Jun  5 20:42:05 2019
=============================================================
Applying log on standby database
oracle@srv01:/u01/app/dbvisit/standby/ [MyDB] ./dbvctl -d MyDBSTD2
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 69764)
dbvctl started on srv01: Wed Jun  5 20:42:45 2019
=============================================================

>>> Sending heartbeat message... done

>>> Applying Log file(s) from srv03 to MyDB on srv01:

    thread 1 sequence 67003 (1_67003_987102791.arc)
    thread 1 sequence 67004 (1_67004_987102791.arc)
...
...
...
    thread 1 sequence 67384 (1_67384_987102791.arc)
    thread 1 sequence 67385 (1_67385_987102791.arc)
    Last applied log(s):
    thread 1 sequence 67385

    Next SCN required for recovery 50112484332 generated at 2019-06-05:20:28:24 +02:00.
    Next required log thread 1 sequence 67386

>>> Dbvisit Archive Management Module (AMM)

    Config: number of archives to keep      = 0
    Config: number of days to keep archives = 3
    Config: diskspace full threshold        = 80%
==========

Processing /u03/app/oracle/dbvisit_arch/MyDB...
    Archive log dir: /u03/app/oracle/dbvisit_arch/MyDB
    Total number of archive files   : 383
    Number of archive logs deleted = 0
    Current Disk percent full       : 8%

=============================================================
dbvctl ended on srv01: Wed Jun  5 21:16:30 2019
=============================================================
Running a gap report
oracle@srv03:/u01/app/dbvisit/standby/ [MyDB] ./dbvctl -d MyDBSTD2 -i
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 44143)
dbvctl started on srv03: Wed Jun  5 21:17:03 2019
=============================================================


Dbvisit Standby log gap report for MyDB_03 thread 1 at 201906052117:
-------------------------------------------------------------
Destination database on srv01 is at sequence: 67385.
Source database on srv03 is at log sequence: 67387.
Source database on srv03 is at archived log sequence: 67386.
Dbvisit Standby last transfer log sequence: 67385.
Dbvisit Standby last transfer at: 2019-06-05 20:42:05.

Archive log gap for thread 1:  1.
Transfer log gap for thread 1: 1.
Standby database time lag (DAYS-HH:MI:SS): +00:48:41.
Switchover to srv01

Now we are having both srv01 and srv02 standby databases up and running and connected with current srv03 primary database. Let’s switchover to srv01 and see what would be the required steps. After each switchover the other standby DDC configuration files will have to be manually updated.

Checking srv03 and srv02 are synchronized

Both srv03 and srv02 databases should be in sync otherwise ship and apply archive logs.

oracle@srv03:/u01/app/dbvisit/standby/ [MyDB] ./dbvctl -d MyDBSTD1 -i
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 93307)
dbvctl started on srv03: Wed Jun  5 21:27:02 2019
=============================================================


Dbvisit Standby log gap report for MyDB_03 thread 1 at 201906052127:
-------------------------------------------------------------
Destination database on srv02 is at sequence: 67386.
Source database on srv03 is at log sequence: 67387.
Source database on srv03 is at archived log sequence: 67386.
Dbvisit Standby last transfer log sequence: 67386.
Dbvisit Standby last transfer at: 2019-06-05 21:24:47.

Archive log gap for thread 1:  0.
Transfer log gap for thread 1: 0.
Standby database time lag (DAYS-HH:MI:SS): +00:27:02.


=============================================================
dbvctl ended on srvxdb03: Wed Jun  5 21:27:08 2019
=============================================================
Checking srv03 and srv01 are synchronized

Both srv03 and srv01 databases should be in sync otherwise ship and apply archive logs.

oracle@srv03:/u01/app/dbvisit/standby/ [MyDB] ./dbvctl -d MyDBSTD2 -i
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 90871)
dbvctl started on srv03: Wed Jun  5 21:26:31 2019
=============================================================


Dbvisit Standby log gap report for MyDB_03 thread 1 at 201906052126:
-------------------------------------------------------------
Destination database on srv01 is at sequence: 67386.
Source database on srv03 is at log sequence: 67387.
Source database on srv03 is at archived log sequence: 67386.
Dbvisit Standby last transfer log sequence: 67386.
Dbvisit Standby last transfer at: 2019-06-05 21:26:02.

Archive log gap for thread 1:  0.
Transfer log gap for thread 1: 0.
Standby database time lag (DAYS-HH:MI:SS): +00:26:02.
Switchover to srv01
oracle@srv03:/u01/app/dbvisit/standby/ [MyDB] ./dbvctl -d MyDBSTD2 -o switchover
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 20334)
dbvctl started on srv03: Wed Jun  5 21:31:56 2019
=============================================================

>>> Starting Switchover between srv03 and srv01

Running pre-checks       ... done
Pre processing           ... done
Processing primary       ... done
Processing standby       ... done
Converting standby       ... done
Converting primary       ... done
Completing               ... done
Synchronizing            ... done
Post processing          ... done

>>> Graceful switchover completed.
    Primary Database Server: srv01
    Standby Database Server: srv03

>>> Dbvisit Standby can be run as per normal:
    dbvctl -d MyDBSTD2


PID:20334
TRACE:20334_dbvctl_switchover_MyDBSTD2_201906052131.trc

=============================================================
dbvctl ended on srv03: Wed Jun  5 21:37:40 2019
=============================================================
Attach srv02 to srv01 (new primary)

Previously to the switchover :

  • srv03 and srv01 was using MyDBSTD2 DDC configuration file
  • srv03 and srv02 was using MyDBSTD1 DDC configuration file

srv02 standby database needs now to be attach to new primary srv01. For this we will copy the MyDBSTD1 DDC configuration file from srv02 to srv01 as it is the first time srv01 is primary. Otherwise, we would only need to update accordingly the already existing file.

I have been transferring the DDC file :

oracle@srv02:/u01/app/dbvisit/standby/conf/ [MyDB] scp dbv_MyDBSTD1.env oracle@srv01:$PWD
dbv_MyDBSTD1.env    100%   23KB  14.8MB/s   00:00

MyDBSTD1 configuration file has been updated accordingly to reflect new changes and configuration :

  • SOURCE needs to be replaced from srv03 to srv01
  • DESTINATION will remain srv02
  • DB_UNIQUE_NAME needs to be replaced fromMyDB_03 to MyDB_01
  • DB_UNIQUE_NAME_DR will remain MyDB_02
oracle@srv01:/u01/app/dbvisit/standby/conf/ [MyDB] vi dbv_MyDBSTD1.env

oracle@srv01:/u01/app/dbvisit/standby/conf/ [MyDB] grep ^SOURCE dbv_MyDBSTD1.env
SOURCE = srv01

oracle@srv01:/u01/app/dbvisit/standby/conf/ [MyDB] grep DB_UNIQUE_NAME dbv_MyDBSTD1.env
# DB_UNIQUE_NAME      - Primary database db_unique_name
DB_UNIQUE_NAME = MyDB_01
# DB_UNIQUE_NAME_DR   - Standby database db_unique_name
DB_UNIQUE_NAME_DR = MyDB_02
Checking that databases are all synchronized

After performing several switch logfile on the primary in order to generate archive logs, I transferred and applied needed archive log files on both srv02 and srv03 standby databases. I made sure both are synchronized.

srv01 and srv03 databases :

oracle@srv01:/u01/app/dbvisit/standby/ [MyDB] ./dbvctl -d MyDBSTD2 -i
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 98156)
dbvctl started on srv01: Wed Jun  5 21:52:08 2019
=============================================================


Dbvisit Standby log gap report for MyDB_01 thread 1 at 201906052152:
-------------------------------------------------------------
Destination database on srv03 is at sequence: 67413.
Source database on srv01 is at log sequence: 67414.
Source database on srv01 is at archived log sequence: 67413.
Dbvisit Standby last transfer log sequence: 67413.
Dbvisit Standby last transfer at: 2019-06-05 21:51:13.

Archive log gap for thread 1:  0.
Transfer log gap for thread 1: 0.
Standby database time lag (DAYS-HH:MI:SS): +00:00:00.


=============================================================
dbvctl ended on srv01: Wed Jun  5 21:52:18 2019
=============================================================

srv01 and srv02 databases :

oracle@srv01:/u01/app/dbvisit/standby/ [MyDB] ./dbvctl -d MyDBSTD1 -i
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 100393)
dbvctl started on srv01: Wed Jun  5 21:56:06 2019
=============================================================


Dbvisit Standby log gap report for MyDB_01 thread 1 at 201906052156:
-------------------------------------------------------------
Destination database on srv02 is at sequence: 67413.
Source database on srv01 is at log sequence: 67414.
Source database on srv01 is at archived log sequence: 67413.
Dbvisit Standby last transfer log sequence: 67413.
Dbvisit Standby last transfer at: 2019-06-05 21:55:22.

Archive log gap for thread 1:  0.
Transfer log gap for thread 1: 0.
Standby database time lag (DAYS-HH:MI:SS): +00:05:13.


=============================================================
dbvctl ended on srv01: Wed Jun  5 21:56:07 2019
=============================================================
Apply delay lag

MyDBSTD2 configuration should at the end have an apply lag of 24 hours. This can be achieved using APPLY_DELAY_LAG_MINUTES in the configuration. In order to test it, I have decided with customer to use 60 minutes delay.

Update MyDBSTD2 DDC configuration file

Following parameters have been updated in the configuration :
APPLY_DELAY_LAG_MINUTES = 60
DMN_MONITOR_INTERVAL_DR = 0
TRANSFER_LOG_GAP_THRESHOLD = 0
ARCHIVE_LOG_GAP_THRESHOLD = 60

APPLY_DELAY_LAG_MINUTES is the delay in minutes to take in account before applying the vector changes.
DMN_MONITOR_INTERVAL_DR is the interval in sec for log monitor schedule on destination. 0 mean deactivated.
TRANSFER_LOG_GAP_THRESHOLD is the difference allowed between the last archived sequence on the primary and the last sequence transferred to the standby server.
ARCHIVE_LOG_GAP_THRESHOLD is the difference allowed between the last archived sequence on the primary and the last applied sequence on the standby database before an alert is sent.

oracle@srv03:/u01/app/dbvisit/standby/conf/ [MyDB] cp dbv_MyDBSTD2.env dbv_MyDBSTD2.env.201906131343

oracle@srv03:/u01/app/dbvisit/standby/conf/ [MyDB] vi dbv_MyDBSTD2.env

oracle@srv03:/u01/app/dbvisit/standby/conf/ [MyDB] diff dbv_MyDBSTD2.env dbv_MyDBSTD2.env.201906131343
281c281
DMN_MONITOR_INTERVAL_DR = 0
---
DMN_MONITOR_INTERVAL_DR = 5
331c331
APPLY_DELAY_LAG_MINUTES = 60
---
APPLY_DELAY_LAG_MINUTES = 0
374c374
ARCHIVE_LOG_GAP_THRESHOLD = 60
---
ARCHIVE_LOG_GAP_THRESHOLD = 0

oracle@srv03:/u01/app/dbvisit/standby/conf/ [MyDB] grep ^TRANSFER_LOG_GAP_THRESHOLD dbv_MyDBSTD2.env
TRANSFER_LOG_GAP_THRESHOLD = 0
Report displayed with an apply delay lag been configured

When generating a report, we can see that there is no gap in the log transfer as the archive log would be transferred through the crontab every 10 minutes. On the other side, we can see that there is an expected delay of 60 minutes in applying the logs.

oracle@srv03:/u01/app/dbvisit/standby/ [MyDBTEST] ./dbvctl -d MyDBSTD2 -i
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 66003)
dbvctl started on srv03: Thu Jun 13 15:21:29 2019
=============================================================


Dbvisit Standby log gap report for MyDB_03 thread 1 at 201906131521:
-------------------------------------------------------------
Destination database on srv01 is at sequence: 73856.
Source database on srv03 is at log sequence: 73890.
Source database on srv03 is at archived log sequence: 73889.
Dbvisit Standby last transfer log sequence: 73889.
Dbvisit Standby last transfer at: 2019-06-13 15:20:15.

Archive log gap for thread 1:  33 (apply_delay_lag_minutes=60).
Transfer log gap for thread 1: 0.
Standby database time lag (DAYS-HH:MI:SS): +01:00:00.


=============================================================
dbvctl ended on srv03: Thu Jun 13 15:21:35 2019
=============================================================
Cascading standby database

What about cascading standby database? Cascading standby database is possible with dbvisit. We would be using a cascaded standby for a reporting server that needs to be updated less frequently or if we would like to unload the primary database in sending archive logs to multiple standby databases. The cascaded standby database will remain updated through the first standby. Cascading is possible since dbvisit version 8.

Following needs to be known :

  • Switchover will not be possible between the primary and the cascaded standby database.
  • The DDC configuration file between the first standby and the cascaded standby needs to have :
    • As SOURCE the first standby database
    • CASCADE parameter set to Y. This will be done automatically when creating the DDC configuration with dbvctl -o setup. From the traces you will see : >>> Source database is a standby database. CASCADE flag will be turned on.
    • ARCHDEST and ARCHSOURCE location on the first standby needs to have same values.

    The principle is then exactly the same, and running dbvctl -d from the first standby will ship the archive log to the second standby.

I had been running some tests in my lab.

Environment

DBVP is the primary server.
DBVS is the first standby server.
DBVS2 is the second cascaded server.

oracle@DBVP:/u01/app/dbvisit/standby/ [DBVPDB] DBVPDB
********* dbi services Ltd. *********
STATUS                 : OPEN
DB_UNIQUE_NAME         : DBVPDB_SITE1
OPEN_MODE              : READ WRITE
LOG_MODE               : ARCHIVELOG
DATABASE_ROLE          : PRIMARY
FLASHBACK_ON           : NO
FORCE_LOGGING          : YES
VERSION                : 12.2.0.1.0
CDB Enabled            : NO
*************************************

oracle@DBVS:/u01/app/dbvisit/standby/ [DBVPDB] DBVPDB
********* dbi services Ltd. *********
STATUS                 : MOUNTED
DB_UNIQUE_NAME         : DBVPDB_SITE2
OPEN_MODE              : MOUNTED
LOG_MODE               : ARCHIVELOG
DATABASE_ROLE          : PHYSICAL STANDBY
FLASHBACK_ON           : NO
FORCE_LOGGING          : YES
CDB Enabled            : NO
*************************************


oracle@DBVS2:/u01/app/dbvisit/standby/ [DBVPDB] DBVPDB
********* dbi services Ltd. *********
STATUS                 : MOUNTED
DB_UNIQUE_NAME         : DBVPDB_SITE3
OPEN_MODE              : MOUNTED
LOG_MODE               : ARCHIVELOG
DATABASE_ROLE          : PHYSICAL STANDBY
FLASHBACK_ON           : NO
FORCE_LOGGING          : YES
CDB Enabled            : NO
*************************************
Create cascaded DDC configuration file

The DDC configuration file will be created from the first standby node.
DBVS (first standby server) will be the SOURCE.
DBVS2 (cascaded standby server) will be the DESTINATION.

oracle@DBVS:/u01/app/dbvisit/standby/ [DBVPDB] ./dbvctl -o setup


=========================================================

     Dbvisit Standby Database Technology (8.0.20_0_g7e6bd51b)
           http://www.dbvisit.com

=========================================================

=>dbvctl only needs to be run on the primary server.

Is this the primary server?  [Yes]:
The following Dbvisit Database configuration (DDC) file(s) found on this
server:

     DDC
     ===
1)   Create New DDC
2)   DBVPDB
3)   DBVPDB_SITE1
4)   DBVPOMF_SITE1
5)   Cancel

Please enter choice [] : 1

Is this correct?  [Yes]:

...


Continue ?  [No]: yes

=========================================================
Dbvisit Standby setup begins.
=========================================================
The following Oracle instance(s) have been found on this server:

     SID            ORACLE_HOME
     ===            ===========
1)   rdbms12201     /u01/app/oracle/product/12.2.0/dbhome_1
2)   DBVPDB         /u01/app/oracle/product/12.2.0/dbhome_1
3)   DBVPOMF        /u01/app/oracle/product/12.2.0/dbhome_1
4)   DUP            /u01/app/oracle/product/12.2.0/dbhome_1
5)   Enter own ORACLE_SID and ORACLE_HOME
Please enter choice [] : 2

Is this correct?  [Yes]:
=>ORACLE_SID will be: DBVPDB
=>ORACLE_HOME will be: /u01/app/oracle/product/12.2.0/dbhome_1

>>> Source database is a standby database. CASCADE flag will be turned on.

Yes to continue or No to cancel setup?  [Yes]:

...
...
...

Below are the list of configuration variables provided during the setup process:

Configuration Variable             Value Provided
======================             ==============
ORACLE_SID                         DBVPDB
ORACLE_HOME                        /u01/app/oracle/product/12.2.0/dbhome_1

SOURCE                             DBVS
ARCHSOURCE                         /u90/dbvisit_arch/DBVPDB_SITE2
RAC_DR                             N
USE_SSH                            Y
DESTINATION                        DBVS2
NETPORT                            22
DBVISIT_BASE_DR                    /oracle/u01/app/dbvisit
ORACLE_HOME_DR                     /u01/app/oracle/product/12.2.0/dbhome_1
DB_UNIQUE_NAME_DR                  DBVPDB_SITE3
ARCHDEST                           /u90/dbvisit_arch/DBVPDB_SITE3
ORACLE_SID_DR                      DBVPDB
ENV_FILE                           DBVPDB_CASCADED

Are these variables correct?  [Yes]:

>>> Dbvisit Database configuration (DDC) file DBVPDB_CASCADED created.

>>> Dbvisit Database repository (DDR) already installed.
   Repository Version          8.3
   Software Version            8.3
   Repository Status           VALID


Do you want to enter license key for the newly created Dbvisit Database configuration (DDC) file?  [Yes]:

Enter license key and press Enter: []: 4jo6z-8aaai-u09b6-ijjxe-cxks5-1114a-ozfvp
oracle@dbvs2's password:
>>> Dbvisit Standby License
License Key     : 4jo6z-8aaai-u09b6-ijjxe-cxks5-1114a-ozfvp
customer_number : 1
dbname          :
expiry_date     : 2019-05-29
product_id      : 8
sequence        : 1
status          : VALID
updated         : YES

PID:25571
TRACE:dbvisit_install.log

dbvisit software could see that the SOURCE is already a standby database. The software will then automatically configured the CASCADE flag to Y.

>>> Source database is a standby database. CASCADE flag will be turned on.
oracle@DBVS:/u01/app/dbvisit/standby/conf/ [DBVPDB] grep CASCADE dbv_DBVPDB_CASCADED.env
# Variable: CASCADE
#      CASCADE = Y
CASCADE = Y
Synchronize first standby with primary Ship archive log from primary to first standby
oracle@DBVP:/u01/app/dbvisit/standby/ [DBVPDB] ./dbvctl -d DBVPDB
=============================================================
Dbvisit Standby Database Technology (8.0.20_0_g7e6bd51b) (pid 23506)
dbvctl started on DBVP: Wed May 15 01:24:55 2019
=============================================================

>>> Obtaining information from standby database (RUN_INSPECT=Y)... done
    Thread: 1 Archive log gap: 3. Transfer log gap: 3
>>> Transferring Log file(s) from DBVPDB on DBVP to DBVS for thread 1:

    thread 1 sequence 50 (o1_mf_1_50_gfpmk7sg_.arc)
    thread 1 sequence 51 (o1_mf_1_51_gfpmkc7p_.arc)
    thread 1 sequence 52 (o1_mf_1_52_gfpmkf7w_.arc)

=============================================================
dbvctl ended on DBVP: Wed May 15 01:25:06 2019
=============================================================
Apply archive log on first standby
oracle@DBVS:/u01/app/dbvisit/standby/ [DBVPDB] ./dbvctl -d DBVPDB
=============================================================
Dbvisit Standby Database Technology (8.0.20_0_g7e6bd51b) (pid 27769)
dbvctl started on DBVS: Wed May 15 01:25:25 2019
=============================================================


>>> Applying Log file(s) from DBVP to DBVPDB on DBVS:

>>> No new logs to apply.
    Last applied log(s):
    thread 1 sequence 52

    Next SCN required for recovery 885547 generated at 2019-05-15:01:24:29 +02:00.
    Next required log thread 1 sequence 53

=============================================================
dbvctl ended on DBVS: Wed May 15 01:25:27 2019
=============================================================
Run a gap report
oracle@DBVP:/u01/app/dbvisit/standby/ [DBVPDB] ./dbvctl -d DBVPDB -i
=============================================================
Dbvisit Standby Database Technology (8.0.20_0_g7e6bd51b) (pid 23625)
dbvctl started on DBVP: Wed May 15 01:25:55 2019
=============================================================


Dbvisit Standby log gap report for DBVPDB_SITE1 thread 1 at 201905150125:
-------------------------------------------------------------
Destination database on DBVS is at sequence: 52.
Source database on DBVP is at log sequence: 53.
Source database on DBVP is at archived log sequence: 52.
Dbvisit Standby last transfer log sequence: 52.
Dbvisit Standby last transfer at: 2019-05-15 01:25:06.

Archive log gap for thread 1:  0.
Transfer log gap for thread 1: 0.
Standby database time lag (DAYS-HH:MI:SS): +00:00:33.


=============================================================
dbvctl ended on DBVP: Wed May 15 01:25:58 2019
=============================================================
Synchronize cascaded standby with first standby Ship archive log from first standby to cascaded standby
oracle@DBVS:/u01/app/dbvisit/standby/ [DBVPDB] ./dbvctl -d DBVPDB_CASCADED
=============================================================
Dbvisit Standby Database Technology (8.0.20_0_g7e6bd51b) (pid 27965)
dbvctl started on DBVS: Wed May 15 01:26:41 2019
=============================================================

>>> Obtaining information from standby database (RUN_INSPECT=Y)... done
    Thread: 1 Archive log gap: 3. Transfer log gap: 3
>>> Transferring Log file(s) from DBVPDB on DBVS to DBVS2 for thread 1:

    thread 1 sequence 50 (1_50_979494498.arc)
    thread 1 sequence 51 (1_51_979494498.arc)
    thread 1 sequence 52 (1_52_979494498.arc)

=============================================================
dbvctl ended on DBVS: Wed May 15 01:26:49 2019
=============================================================
Apply archive log on cascaded standby
oracle@DBVS2:/u01/app/dbvisit/standby/ [DBVPDB] ./dbvctl -d DBVPDB_CASCADED
=============================================================
Dbvisit Standby Database Technology (8.0.20_0_g7e6bd51b) (pid 21118)
dbvctl started on DBVS2: Wed May 15 01:27:21 2019
=============================================================


>>> Applying Log file(s) from DBVS to DBVPDB on DBVS2:

    thread 1 sequence 50 (1_50_979494498.arc)
    thread 1 sequence 51 (1_51_979494498.arc)
    thread 1 sequence 52 (1_52_979494498.arc)
    Last applied log(s):
    thread 1 sequence 52

    Next SCN required for recovery 885547 generated at 2019-05-15:01:24:29 +02:00.
    Next required log thread 1 sequence 53

=============================================================
dbvctl ended on DBVS2: Wed May 15 01:27:33 2019
=============================================================
Run a gap report
oracle@DBVS:/u01/app/dbvisit/standby/ [DBVPDB] ./dbvctl -d DBVPDB_CASCADED -i
=============================================================
Dbvisit Standby Database Technology (8.0.20_0_g7e6bd51b) (pid 28084)
dbvctl started on DBVS: Wed May 15 01:28:07 2019
=============================================================


Dbvisit Standby log gap report for DBVPDB_SITE2 thread 1 at 201905150128:
-------------------------------------------------------------
Destination database on DBVS2 is at sequence: 52.
Source database on DBVS is at applied log sequence: 52.
Dbvisit Standby last transfer log sequence: 52.
Dbvisit Standby last transfer at: 2019-05-15 01:26:49.

Archive log gap for thread 1:  0.
Transfer log gap for thread 1: 0.
Standby database time lag (DAYS-HH:MI:SS): +00:00:00.


=============================================================
dbvctl ended on DBVS: Wed May 15 01:28:11 2019
=============================================================
Conclusion

With dbvisit we are able to configure several standby databases, choose apply lag delay and also configure cascaded standby. The cons would be that the DDC configuration file needs to be manually adapted after each switchover.

Cet article Having multiple standby databases and cascading with dbvisit est apparu en premier sur Blog dbi services.

Why you really should use peer authentication in PostgreSQL

Tue, 2019-10-22 14:36

It is always a bit of a surprise that many people do not know peer authentication in PostgreSQL. You might ask why that is important as initdb creates a default pg_hba.conf which does not allow any connections from outside the PostgreSQL server. While that is true there is another important point to consider.

Let’s assume you executed initdb without any options like this:

postgres@centos8pg:/home/postgres/ [pgdev] mkdir /var/tmp/test
postgres@centos8pg:/home/postgres/ [pgdev] initdb -D /var/tmp/test
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /var/tmp/test ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Europe/Zurich
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

pg_ctl -D /var/tmp/test -l logfile start

Did you ever notice the warning at the end of the output?

initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

You might think that this is not important as only the DBAs will have access to the operating system user postgres (or whatever user you used when you executed initdb). Although this might be true in your case, the server eventually might have other local users. Before creating a new user lets start the instance:

postgres@centos8pg:/home/postgres/ [pgdev] export PGPORT=9999
postgres@centos8pg:/home/postgres/ [pgdev] pg_ctl -D /var/tmp/test/ start -l /dev/null
waiting for server to start.... done
server started

You really need to be aware of is this:

postgres@centos8pg:/home/postgres/ [pgdev] sudo useradd test
postgres@centos8pg:/home/postgres/ [pgdev] sudo su - test
[test@centos8pg ~]$ /u01/app/postgres/product/DEV/db_1/bin/psql -p 9999 -U postgres postgres
psql (13devel)
Type "help" for help.

postgres=#

… and you are in as the superuser! So any local user can connect as the superuser by default. What you might want to do is this:

postgres@centos8pg:/home/postgres/ [pgdev] sudo chmod o-rwx /u01/app/postgres/product
postgres@centos8pg:/home/postgres/ [pgdev] sudo su - test
Last login: Tue Oct 22 21:19:58 CEST 2019 on pts/0
[test@centos8pg ~]$ /u01/app/postgres/product/DEV/db_1/bin/psql -p 9999 -U postgres postgres
-bash: /u01/app/postgres/product/DEV/db_1/bin/psql: Permission denied

This prevents all other users on the system from executing the psql binary. If you can guarantee that nobody installs psql in another way on the system that might be sufficient. As soon as psql is available somewhere on the system you’re lost again:

postgres@centos8pg:/home/postgres/ [pgdev] sudo dnf provides psql
Last metadata expiration check: 0:14:53 ago on Tue 22 Oct 2019 09:09:23 PM CEST.
postgresql-10.6-1.module_el8.0.0+15+f57f353b.x86_64 : PostgreSQL client programs
Repo        : AppStream
Matched from:
Filename    : /usr/bin/psql

postgres@centos8pg:/home/postgres/ [pgdev] sudo dnf install -y postgresql-10.6-1.module_el8.0.0+15+f57f353b.x86_64
[test@centos8pg ~]$ /usr/bin/psql -p 9999 -U postgres -h /tmp postgres
psql (10.6, server 13devel)
WARNING: psql major version 10, server major version 13.
Some psql features might not work.
Type "help" for help.

postgres=#

Not really an option. This is where peer authentication becomes very handy.

postgres@centos8pg:/home/postgres/ [pgdev] sed -i 's/local   all             all                                     trust/local   all             all                                     peer/g' /var/tmp/test/pg_hba.conf

Once you switched from trust to peer for local connections only the operating system user that created the instance will be able to connect locally without providing a password:

postgres@centos8pg:/home/postgres/ [pgdev] pg_ctl -D /var/tmp/test/ reload
server signaled
postgres@centos8pg:/home/postgres/ [pgdev] psql postgres
psql (13devel)
Type "help" for help.

[local]:9999 postgres@postgres=#

Other local users will not be able to connect anymore:

postgres@centos8pg:/home/postgres/ [pgdev] sudo su - testLast login: Tue Oct 22 21:25:36 CEST 2019 on pts/0
[test@centos8pg ~]$ /usr/bin/psql -p 9999 -U postgres -h /tmp postgres
psql: FATAL:  Peer authentication failed for user "postgres"
[test@centos8pg ~]$

So, please, consider enabling peer authentication or at least go for md5 for local connections as well.

Cet article Why you really should use peer authentication in PostgreSQL est apparu en premier sur Blog dbi services.

Pages