
Improving PostgreSQL for Greater Enterprise Adoption


Formal Metadata

Title: Improving PostgreSQL for Greater Enterprise Adoption
Number of Parts: 29
License: CC Attribution - ShareAlike 3.0 Unported
Production Place: Ottawa, Canada

Content Metadata

Abstract
The PostgreSQL Enterprise Consortium (PGECons for short) is an organization of major IT companies in Japan that aims to promote PostgreSQL to enterprise users in the country. Since PGECons was established in 2012, we have been surveying its members about PostgreSQL's functionality and performance to estimate how well PostgreSQL meets their requirements. In this talk we focus on some of the major requests from those surveys, including enhancements to table partitioning and to error message handling. From the enterprise users' point of view, we would like to share the obstacles behind these requests that might limit PostgreSQL's acceptance, so that we can address them together with the community.

[Partitioning] The larger a database grows, the more difficult the design and operation of the system become. A sales accounting application, for example, has this difficulty, which is often addressed with horizontal table partitioning. We surveyed PGECons members on how PostgreSQL can be applied to this kind of application and evaluated the performance of table partitioning. The survey results suggest the following partitioning issues in PostgreSQL: a larger number of partitions slows query response time, and the definition and use of partitioned tables are intricate. Drawing on additional member surveys about proprietary DBMS usage, we will show what is needed from partitioning and which important application areas are difficult for PostgreSQL to support today, and argue that enhancing the partitioning functionality will help PostgreSQL spread into those areas.

[Error messages] When troubleshooting a problem with PostgreSQL, a user analyzes the error logs and error messages to identify the cause. However, when analyzing a problem using only the SQLSTATE code, we sometimes cannot identify the exact cause, because one error code is often assigned to multiple different error messages. We face a similar difficulty when analyzing a problem from the error message text alone. To cope with these problems, Fujitsu Limited, a PGECons member, delivers to its customers a list of error messages sorted by SQLSTATE code for looking up corrective actions. In the list, an identifier is added to each item so that customers can decide on corrective actions immediately and communicate smoothly with support staff. Based on these activities, we will propose an error message identification scheme with two benefits: it accelerates decisions on corrective actions, and it improves users' ability to support themselves.

In this presentation, we will present the needs and actions of PGECons members as enterprise users and share the desired improvements, which should help promote PostgreSQL together with the community.
Transcript: English (auto-generated)
OK, it is time for the presentation, so let me start. Ladies and gentlemen, my name is Tetsu Sakata from NTT Japan. Today we will talk about improving PostgreSQL for greater enterprise adoption.
To introduce ourselves first: my name is Tetsu Sakata. I work for the NTT Open Source Software Center and have been working with PostgreSQL for about 10 years. And this is Yurie Enomoto.
She works for Fujitsu Limited. We are both members of the PostgreSQL Enterprise Consortium, an organization of IT companies in Japan such as Fujitsu, NTT, NEC, SRA OSS, and so on.
In all, there are 16 member companies; detailed information is available here. We already gave a talk at PGCon 2013.
Before getting into the topics, I will briefly introduce the situation of PostgreSQL in Japan. In Japan, open-source software is widely applied to enterprise systems, too. At an early stage, PostgreSQL was applied only to peripheral enterprise systems.
Now it is applied to mission-critical enterprise systems. According to Fujitsu's survey, they have delivered about 5,000 systems based on open-source software
to Japanese companies. So now, taking advantage of this know-how about mission-critical systems, our PostgreSQL Enterprise Consortium is aiming for wider adoption of PostgreSQL
in enterprise systems. Today, we'll talk about two topics based on our consortium activities. We surveyed functionality and performance
in the mission-critical systems area and found that the following items should be improved. One is the partitioning facility: the performance of queries on a partitioned table is not sufficient to handle a large number of partitions, such as several hundred.
The second is facilities for troubleshooting, which will be discussed in detail later. Several other issues remain, but they will have to wait for another day.
Thank you. As I mentioned before, I have worked with PostgreSQL for many years.
Our customers sometimes complain that PostgreSQL's partitioning facilities are not as good as those of commercial databases. So, to clarify the issues, we studied how partitioning is used, focusing on two points.
The first is what the typical usage patterns and cases for partitioning are. The second is what is required of the partitioning features.
So we did a small survey of the PostgreSQL Enterprise Consortium members. In this survey, most of the systems were OLTP, online transaction processing, applications; of course there are many OLAP applications
with different requirements, too. And we found two typical patterns of partitioning usage. One is logging time-series data, and the other is
data divided into branches. These names are coined for this presentation. The first pattern is logging data. We found that many large databases store time-series log
data. Log data is inserted into these databases continuously, and afterwards it is aggregated periodically, for example once a week or once a month.
And finally, the log data is discarded periodically, for example once a year. Where do we see such cases? One is the telecommunications area.
My office, the NTT Open Source Software Center, estimates that about one third of our company's systems operate in this pattern. Other PostgreSQL Enterprise Consortium members pointed out that the same pattern is often
seen in financial systems, because those systems have to handle a large amount of business transaction data that is created periodically and continuously. Others mentioned email archival systems,
where old data has to be removed in a short time, and partitioning is used for that purpose.
Next, I will talk about this pattern in detail. Log data arrives at the database continuously and is distributed to partitions according to its timestamp.
For example, data that arrived on June 28 is stored here, data from June 19 here, and so on. When we aggregate data for a specified period, we only have to look at a particular partition,
not all of them. For example, to sum up one day's data, we only have to check that day's partition. And when discarding data, we only
have to truncate the corresponding partitions. Doing so, we can discard data in a very short time, and we can dispense with vacuuming the tables afterwards.
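Concretely, this layout looks roughly as follows. This is a minimal sketch with illustrative table and column names, using the inheritance-based partitioning available at the time of this talk; it is not the consortium's actual benchmark code.

    -- Parent table for the logging time-series pattern.
    CREATE TABLE access_log (
        logged_at  timestamptz NOT NULL,
        product_id integer,
        error_code text,
        payload    text
    );

    -- One child partition per day; the CHECK constraint lets the planner use
    -- constraint exclusion and skip partitions outside the queried range.
    CREATE TABLE access_log_20140628 (
        CHECK (logged_at >= '2014-06-28' AND logged_at < '2014-06-29')
    ) INHERITS (access_log);

    CREATE TABLE access_log_20140629 (
        CHECK (logged_at >= '2014-06-29' AND logged_at < '2014-06-30')
    ) INHERITS (access_log);

    -- Aggregating one day only has to scan that day's partition ...
    SELECT product_id, count(*)
    FROM   access_log
    WHERE  logged_at >= '2014-06-28' AND logged_at < '2014-06-29'
    GROUP  BY product_id;

    -- ... and discarding old data is a cheap TRUNCATE of one partition,
    -- with no DELETE on, and no later VACUUM of, the big table.
    TRUNCATE access_log_20140528;  -- an older daily partition, created the same way

Because the CHECK constraints do not overlap, constraint exclusion lets the planner touch only the partitions matching the WHERE clause.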
Now for pattern two, which we named data divided into branches. A database used by a financial application stores its data in partitions, and each partition stores data related to only one branch
of a bank. The number of partitions is about 1,000, though of course it depends on how many branches the bank has. In this case, the data related to each branch
can be handled and queried independently. Of course, there are also some aggregation queries that scan data from many partitions.
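Concretely, the branch layout can be sketched like this, again with the inheritance mechanism available at the time; the table and column names are invented for illustration.

    -- Parent table for "data divided into branches".
    CREATE TABLE account_tx (
        branch_id integer     NOT NULL,
        tx_time   timestamptz NOT NULL,
        amount    numeric
    );

    -- List-style split: one child per bank branch.
    CREATE TABLE account_tx_b101 (CHECK (branch_id = 101)) INHERITS (account_tx);
    CREATE TABLE account_tx_b102 (CHECK (branch_id = 102)) INHERITS (account_tx);

    -- Optional second level (mentioned later in the talk): within one branch,
    -- a range split along the time axis.
    CREATE TABLE account_tx_b101_201406 (
        CHECK (branch_id = 101
               AND tx_time >= '2014-06-01' AND tx_time < '2014-07-01')
    ) INHERITS (account_tx_b101);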
According to our small survey, the common situations in which partitioning is used are as follows. First, as to database size: databases larger than about 1 terabyte
require a partitioning facility. As to typical applications, the two patterns I mentioned, logging time-series data and data divided into branches, are typical.
The next question is what kinds of queries are used frequently. Most queries search only one partition, to look up data or to aggregate it. On the other hand, some queries aggregate data from more than one partition or join tuples between partitions.
We have now depicted the situations in which partitioning is used. Let's talk about the main requirements for partitioning.
First, the number of partitions: about 1,000 partitions are required, and we think 1,000 partitions are sufficient for most applications.
For example, in the logging time-series data pattern, if we use one partition per day and have to store data for three years, the number of partitions is about 1,000 (3 × 365 ≈ 1,100). Fast query execution over that many partitions
is required, and easy definition is required too. The second requirement concerns which partitioning rules are used frequently.
As far as we know, those are range partitioning and list partitioning. In the logging time-series data case, the table is divided by a range partitioning rule. For data divided into branches,
the branches can be enumerated, so list partitioning is used. Some applications require multi-level partitioning: for example, the data is first divided into branches, and each branch's data is then divided along the time axis
by range partitioning. On the other hand, hash partitioning is seldom used, as far as we know. The third point is the size of one partition. According to our survey, this amounts to some tens of millions of tuples per partition.
In other words, the data size per partition is roughly 10 to 100 gigabytes. So we have clarified the situations and requirements
for partitioning. Next, we would like to know its performance issues. The PostgreSQL Enterprise Consortium evaluated the performance of partitioning using the logging time-series data pattern
to clarify those issues. We used various numbers of partitions, 30, 90, and 180, to measure the slowdown caused by partitioning overhead.
At the same time, we tested various methods of distributing data to partitions, to measure the distribution overhead: a static function, a dynamic function, and a function using a cached plan.
These methods are explained in detail later. This is our evaluation model. First, the operation pattern is logging data.
In this model, we log data from a group of products, and the log records are stored in the table continuously. As the logs arrive, they are stored in partition tables.
One partition is prepared for each day's data. So this partition holds the data that arrived on June 28,
this one June 29, and so on. Data is kept for one, three, or six months, so 30, 90, or 180 partitions are used, respectively. We have two evaluation scenarios.
One is the data logging, or data insertion, scenario: three data distribution functions, implemented as trigger functions, are compared for performance. The second scenario is data aggregation:
a query is issued that aggregates the log records for one month. Next, I'll explain them in detail. This is the model of the data logging test.
Three volumes of inserted data were tested, corresponding to 30, 90, and 180 days of data.
Three data distribution methods were used, and we also measured the insertion time for the same data into a single, non-partitioned table for comparison.
The three distribution methods are as follows. The first is a static distribution function, which selects a partition by a series of hard-coded if-then rules, one per partition, each checking whether that partition is the right place to store the data.
The second is a dynamic distribution function, which builds an INSERT statement for the suitable partition based on the partition key contained in the inserted row. The third is a function with a plan cache.
It builds an INSERT statement like the dynamic function, but reuses the prepared plan of that statement; it is implemented in C so that it can use the plan cache via the SPI interface.
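To make the comparison concrete, here is a minimal sketch of the first two methods as trigger functions, assuming daily partitions named access_log_YYYYMMDD and a logged_at timestamp column. These names are illustrative, this is not the consortium's benchmark code, and the plan-cache variant, being a C function using SPI, is not shown.

    -- Method 1: "static" distribution -- a hard-coded IF/ELSIF chain.
    -- Every partition needs its own branch, so the function grows with the table.
    CREATE OR REPLACE FUNCTION access_log_insert_static() RETURNS trigger AS $$
    BEGIN
        IF NEW.logged_at >= DATE '2014-06-28' AND NEW.logged_at < DATE '2014-06-29' THEN
            INSERT INTO access_log_20140628 VALUES (NEW.*);
        ELSIF NEW.logged_at >= DATE '2014-06-27' AND NEW.logged_at < DATE '2014-06-28' THEN
            INSERT INTO access_log_20140627 VALUES (NEW.*);
        -- ... one branch per partition ...
        ELSE
            RAISE EXCEPTION 'no partition for %', NEW.logged_at;
        END IF;
        RETURN NULL;  -- row already routed; do not insert into the parent
    END;
    $$ LANGUAGE plpgsql;

    -- Method 2: "dynamic" distribution -- build the INSERT from the partition key.
    -- Much shorter, but EXECUTE re-plans the statement on every call.
    CREATE OR REPLACE FUNCTION access_log_insert_dynamic() RETURNS trigger AS $$
    BEGIN
        EXECUTE format('INSERT INTO %I SELECT ($1).*',
                       'access_log_' || to_char(NEW.logged_at, 'YYYYMMDD'))
        USING NEW;
        RETURN NULL;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER access_log_route
        BEFORE INSERT ON access_log
        FOR EACH ROW EXECUTE PROCEDURE access_log_insert_dynamic();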
Here is the result of the data insertion test. In the result figure, the small blue bar is the time for 30 partitions,
the red one is for 90 partitions, and the orange one is for 180. The vertical axis shows the elapsed time for insertion, and this line marks 24 hours, that is, one day.
The horizontal axis shows the type of distribution function: these three bars are for the static function, these for the dynamic function, these for the plan cache, and finally the rightmost ones are for direct insertion into a single table.
Looking at the elapsed insertion times, the longest is the static function and the second longest is the dynamic function; the distribution function with the plan cache
outperformed the others and is very fast. The static function, however, was prohibitively slow, so we gave up testing the 180-day case with it. So the dynamic function is
better than the static one, and it is easy to implement. The function with the plan cache is very fast, but it is difficult to implement and requires more effort; and if it has a bug, it can crash the server,
so thorough testing is important.
The final point: the function with the plan cache is very fast but difficult to implement. This shows that PostgreSQL should provide a fast and reliable built-in way to distribute data.
We will repeat this point later. To predict PostgreSQL's performance for a larger number of partitions,
we analyzed the insertion time per partition. We compared the two methods other than the static function with regard to the time needed to fill one partition. Both methods insert rows into partitions at an almost constant rate
while the number of partitions changes. In this graph, the vertical axis shows the elapsed time for insertion and the horizontal axis shows the number of partitions.
If a line is horizontal, insertion requires constant time regardless of the number of partitions. The dynamic function, shown by the blue line,
grows slowly as the number of partitions increases, and this might become a latent problem when handling a very large number of partitions.
Next, let's talk about the aggregation test. We used an aggregation query that scans the log records
in a specified period and aggregates them by product ID and error code. Here is the schema we used. There are two tables: one is the access log table,
which is partitioned into 30, 90, or 180 partitions, and the other is the product master. The access log table is joined to the product master table via the product ID.
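A hedged reconstruction of this kind of aggregation query, with assumed column names, would be:

    -- Scan one month of log records and aggregate by product ID and error code,
    -- joining the partitioned access-log table to the product master.
    SELECT m.product_id, m.product_name, l.error_code, count(*) AS occurrences
    FROM   access_log AS l
    JOIN   product_master AS m ON m.product_id = l.product_id
    WHERE  l.logged_at >= '2014-06-01' AND l.logged_at < '2014-07-01'
    GROUP  BY m.product_id, m.product_name, l.error_code
    ORDER  BY occurrences DESC;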
We tested the query against 30, 90, and 180 partitions, and this is the result.
We ran the aggregation test with three different kinds of scan method: index scan,
shown in blue; bitmap scan, in red; and sequential scan, in orange. This graph shows the result. Next, to predict aggregation performance
for a large number of partitions, we analyzed how the aggregation time grows with the number of partitions. To see the growth, we calculated the relative elapsed time, taking the 30-partition case as unity.
The result for each access method is shown here: the vertical axis shows the relative elapsed time and the horizontal axis shows the number of partitions.
From this figure, we can see that the relative time with sequential access, the orange line, increases as the number of partitions increases. This might be a latent problem for a large number of partitions.
Let us summarize the evaluation. We did a performance evaluation that simulates a logging time-series data application, which is widely used in enterprises. As to the data insertion test, insertion can be sped up by a distribution function
that creates the INSERT statement dynamically and uses the plan cache. With these functions, we estimate that about 200 partitions can be used in production systems. However, from the viewpoint of usability,
our evaluation of the three data distribution methods shows that the static function is the easiest to implement.
However, it is very slow. The other two methods process data effectively, but they are more difficult to implement.
Next, we summarize the data aggregation test. We evaluated the data aggregation speed for a table
that has 180 partitions. In this test, PostgreSQL aggregated data from that table quickly enough. We then predicted PostgreSQL's performance for more partitions based on the 180-partition case.
This brief analysis shows that queries against a larger number of partitions will take much longer. According to a hearing with our consortium members, queries against a table that has
many more partitions, for example 500, cannot be executed as quickly as in this case. This shows that we need improved planning and execution, such as faster constraint exclusion. So finally, I will point out the solutions we are hoping for.
Concerning data definition, we need a kind of explicit syntax; in other words, an easier definition of partitioning is desired,
for example a dedicated statement for partition definition. With such a mechanism, boundary checks should be performed automatically, and since range and list partitioning are the most common, they should be easy to define.
A compact notation for a large number of partitions is also required. Ideally, we would like to handle 1,000 partitions, so being able to define 1,000 partitions in a few lines is ideal.
For example, a range partition on a time-series domain could define partitions per day, per month, and so on with simple syntax, and a list partition could take a table of values and define one partition for each row of that table.
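For illustration only: the kind of dedicated, concise definition syntax asked for here is close to what later shipped in PostgreSQL 10 as declarative partitioning, where the server checks the boundaries instead of hand-written triggers and CHECK constraints.

    -- Range partitioning on a time column.
    CREATE TABLE access_log (
        logged_at  timestamptz NOT NULL,
        product_id integer,
        error_code text
    ) PARTITION BY RANGE (logged_at);

    CREATE TABLE access_log_201406 PARTITION OF access_log
        FOR VALUES FROM ('2014-06-01') TO ('2014-07-01');

    -- List partitioning on a branch identifier.
    CREATE TABLE account_tx (
        branch_id integer     NOT NULL,
        tx_time   timestamptz NOT NULL
    ) PARTITION BY LIST (branch_id);

    CREATE TABLE account_tx_b101 PARTITION OF account_tx
        FOR VALUES IN (101);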
Next, I'll point out the solutions we are hoping for in query execution. First, data distribution.
Today the user defines trigger functions to distribute data to partitions, but the distribution function should be built in and fast.
Speeding it up with binary search or an index-like lookup is desirable. As for query processing, a query against a table with a large number of partitions should be executed more efficiently. Similar to data definition,
looking up the partition for a given value should be fast, which would be easier if we had a built-in data structure for it. That is all from my part.
Please wait a moment.
Sorry for waiting.
From here, I will talk about improving error codes.
Fujitsu provides database products based on PostgreSQL for enterprises. Today, I'd like to introduce our efforts and our proposal for the future. Firstly, what do you think a mission-critical system needs?
We see three points. The first point is non-stop operation: to avoid serious damage to social life, we must not stop social infrastructure systems.
The second point is speed. If a system failure occurs in the financial sector, even a half-day outage becomes news and may cost the customer's confidence. In the case of our company,
our support staff need to respond to the customer within two hours of accepting a failure report. The third point is stable operation.
It is essential even when a busy period or a seasonal event comes. Of course, 24/7 support covering these three points is needed.
To achieve non-stop operation, availability is important. To achieve speed, rapid identification of the cause and the solution is important. For stable operation, it is important to continuously monitor performance
and database operation status to avoid system failures. Because not only PostgreSQL itself but also the linked software and tools have to support these three points well, we think it is essential
for PostgreSQL and its linked software tools to work together, and the PostgreSQL log plays a significant role in that collaboration.
We can get various kinds of information from the PostgreSQL server log. For example, for live monitoring we need to check that processes are running and that SQL is responding. For rapid identification of the cause and the solution,
we need to analyze the details of the error log. For stable operation, we need to check runtime statistics and response times, such as slow queries and autovacuum activity. We also need to check access logs for security: when, who, and from where each database was accessed.
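As a minimal sketch of the standard server settings behind these checks (thresholds are illustrative; ALTER SYSTEM requires PostgreSQL 9.4 or later, and the same parameters can simply be set in postgresql.conf):

    ALTER SYSTEM SET log_min_duration_statement = '3s';     -- log slow queries
    ALTER SYSTEM SET log_autovacuum_min_duration = '1min';  -- log long autovacuum runs
    ALTER SYSTEM SET log_connections = on;                   -- who connected, from where, to which DB
    ALTER SYSTEM SET log_line_prefix = '%m [%p] %u@%d %r ';  -- timestamp, pid, user@db, client host
    SELECT pg_reload_conf();                                 -- apply without a restart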
Today, I will focus especially on quick troubleshooting using the error log.
By the way, this graph shows the results of a PGECons questionnaire about the difficulty of adopting PostgreSQL in the mission-critical area. In Japan, many people want reliable support before using PostgreSQL for mission-critical systems. Furthermore, PGECons asked an additional question:
how do you investigate when a system logs an error? This graph shows the answers for systems not based on PostgreSQL, including commercial products. Most users do not only call support;
they also search websites, manuals, and message collection books by error message text or error message ID to troubleshoot by themselves. On the other hand, regarding PostgreSQL,
there is a strong tendency to investigate by the error message text. Web research is a good procedure because a lot of shared know-how is available on websites, but it would be better to aggregate it.
We also think that finding the solution from the error code, the SQLSTATE, is an important approach, but there are two issues: it takes time to identify the cause, and expert skill and experience are needed.
I will explain the details on the next slide by introducing our experiences. When investigating by error code, many error messages are assigned to one error code.
For example, the error code for "deadlock detected" is raised at two different places in the source code, and because the solution differs for each pattern,
the user has to choose the right corrective measure, which is difficult for a beginner. There are also about 120 error messages that are not assigned their own error code. Please note that these numbers are from
version 9.2. In the case of investigating by error message text, there are 981 message patterns
in which the same error message is output from multiple places in the source code; for one message, the maximum number of duplicated source locations is 54. And when the message contains a variable value,
we have to narrow the search with a fuzzy condition and then check multiple source locations.
I will also describe the difficulties of the PostgreSQL log itself. Mainly, all log messages are output to one log file, so the log cannot be routed by error level or category. For example, error reports and slow query logs
are mixed together. Additionally, it is not possible to specify detailed conditions and monitor by regular expression pattern matching.
When using existing open-source monitoring tools, we can catch the serious error levels such as PANIC and FATAL by character pattern matching, but it is difficult to catch the important slow query log entries.
Based on this, we think there are three points for easier handling of log messages. The first point is assigning message IDs: ideally, the part of the source that generated the error
can be identified at the component level, which also improves detection in operation because monitoring by log pattern matching becomes easier.
The second point is uniqueness, that is, how to assign the message IDs: one message ID per message is best, and we also think the details of the SQLSTATE assignments should be reconsidered. The third point is accumulation of know-how.
Troubleshooting know-how should be aggregated and shared through message collection books and websites. This table shows an example classification of error conditions
based on the action the user should take. It differs from the SQLSTATE classes and adds our own categories, such as performance tuning points; the right side is PostgreSQL's classification and the left side is our example,
to which we have added some classification items of our own. Next, I will explain our original method
of assigning message IDs to PostgreSQL. We use message catalogs, and we support two message catalog structures: one is based on the PO files and the other is a separate list file.
The mechanism for outputting messages differs between the server and each kind of client driver; the figure on the left shows an example.
In our experience, there are some difficulties. For example, some messages are assigned a wrong error code,
or there is a mismatch between the SQLSTATE and the kind of error, so we need to review them internally; in the case of version 9.2, we found about 300 such messages in total.
Additionally, some messages are not registered in the message catalog and therefore have no multi-language support, so we need to recreate the message catalog for them.
In total, that comes to 1,072 messages. Next, I will explain how we manage message IDs when PostgreSQL is upgraded. Because the error message IDs are our own,
we have to maintain them with our own tool during a PostgreSQL upgrade. The role of the tool is, first, to extract the message differences and mechanically calculate a so-called concordance rate
based on the vocabulary, and second, to assign either the same ID or a new ID according to the calculated concordance rate. Because the matching is done by character
string comparison, the result also shows how much a message is affected when its output text is changed in the source code.
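For illustration only, and not Fujitsu's actual tool: a rough concordance rate between old and new message strings can be computed inside PostgreSQL with the trigram similarity provided by the pg_trgm extension.

    CREATE EXTENSION IF NOT EXISTS pg_trgm;

    -- Assumed staging tables holding the extracted messages of two versions.
    CREATE TABLE msg_old (msg_id text PRIMARY KEY, msg_text text NOT NULL);
    CREATE TABLE msg_new (msg_text text NOT NULL);

    -- For each new message, find the most similar old message; if the similarity
    -- is high enough, its ID could be reused, otherwise a new ID would be assigned.
    SELECT n.msg_text,
           o.msg_id,
           similarity(o.msg_text, n.msg_text) AS concordance_rate
    FROM msg_new AS n
    LEFT JOIN LATERAL (
        SELECT msg_id, msg_text
        FROM msg_old
        ORDER BY similarity(msg_text, n.msg_text) DESC
        LIMIT 1
    ) AS o ON true;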
In conclusion, those are our results with message IDs, and we hope to drive further PostgreSQL evolution along these three points.
That's all. Thank you very much. Questions?