These Add Task-- all those little tabs over to the right?
Yeah. So for example, let's say-- or let's do one where there is some more information, like this. So for example, here, on a customer, you can add a task-- it could be a task, or request access, or report an issue, for example. Each one of these you can click on, go into, make updates, and then save.
So for example, if you were to do an issue, let's say, you can write a description: "This appears to be a duplicate. Please investigate."
Are you assigning it to someone in particular?
Yeah, you can assign it within the system. And then, when that person comes in, they'll be able to see it in their space, if you will. Let me see where I can show you that. I'm sorry for jumping back and forth.
That's OK.
Yeah. So for example, here's my Action Center. And I know I have these mainly because I created them specifically for a persona. So there's an issue and there's a request. It comes to you because you've been assigned, and then there's a place where you can look at the whole list of tasks assigned to you. Some have due dates, so either you're overdue and can complete them, or you know that something is coming up and can take a look at that information.
That's a nice feature.
Yeah. So from a collaborative perspective, you need to be able to do that, obviously. And from a good data governance practice, there should always be at least four eyes on it. So stewards, data owners-- the data governance group usually has an overarching view across the business units that feed into it, so that can all be designed into the steps.
Do you have a data governance dashboard or do you build it yourself?
So it's a very interesting question you just asked me. I can't show you how to build one here, but there are dashboards that you can build. You can even build one in Power BI and actually have data come from something like this-- I can't show you that either. But you could definitely build a dashboard for it.
There were some other year-end policies, and they only had limited actions that you could take. Go back to the tables.
OK, sorry. Jumped all the way up. I can't see who's all logged in, but--
OK, so we've got--
--started jumping all over the place.
The Add Task is really your workflow. Is that right?
Yeah. So this would be where I think data stewards will come, and they'll be able to assign tasks, or create, or curate. All of that kind of stuff can be done here. Excuse me.
And then can you go to the lineage?
Yeah. So some of them may not have any lineage at all, and some of them might. This one just happens to have lineage. So then you could [INAUDIBLE] through it, obviously. And let me see if there's anything particularly interesting for you to see.
Now, I see you're going through-- so MongoDB to Snowflake to Tableau. So you're crossing environments.
Absolutely.
How does that work?
Very good question. So the technology comes with data at rest, if you will, like MongoDB, Snowflake. We can go look at their system tables and build a catalog, so we know where things are. But what we also have are smart connectors that connect the databases to the actual reporting environment, like Power BI. So if there is code from Snowflake to Power BI, we can actually look at that ETL and make the connection for you.
So by looking at the code that says source from here, target there-- even where the code doesn't simply update a table or something like that-- it will be able to make that connection for you. What differentiates us is that we have connectors-- everybody does-- but the number of connectors we have that give you insight into how the information is put together, in an automatic way, is where we excel. That's where we have the best connectors that I know of.
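To make that concrete, here is a minimal sketch of the idea-- reading ETL code for its source and target tables and stitching them into lineage edges across environments. This is an illustration only, with invented table names; an actual connector would use a full SQL parser rather than regexes.

    import re

    def extract_lineage(sql):
        # Return (target, sources) pairs from simple INSERT ... SELECT statements.
        edges = []
        for stmt in sql.split(";"):
            target = re.search(r"INSERT\s+INTO\s+([\w.]+)", stmt, re.I)
            if not target:
                continue
            sources = re.findall(r"(?:FROM|JOIN)\s+([\w.]+)", stmt, re.I)
            edges.append((target.group(1), sources))
        return edges

    etl = """
    INSERT INTO snowflake.finance.opex
    SELECT cc.name, SUM(e.amount)
    FROM mongodb.raw.expenses e
    JOIN mongodb.raw.cost_centers cc ON e.cc_id = cc.id
    GROUP BY cc.name;
    """
    print(extract_lineage(etl))
    # [('snowflake.finance.opex', ['mongodb.raw.expenses', 'mongodb.raw.cost_centers'])]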
I think, by far, we're the best at that. It's one of our huge differentiators, and it's an in-house team that does it. So if new technology comes up-- we're doing a lot of work with Databricks right now, and data vault isn't very new, but we have very specific IP around it to help organizations really quickly realize this type of environment, this type of picture.
So for regulatory reporting-- I've done this-- you can print these and actually put them in the attestation documentation. Carlyle will probably do a lot of attestation from a financial perspective to say, these are the investments that we have, for example. Let me go back to that again. For example, let's say opex. You'll be able to show specifically how that information was sourced, what happened.
So if there were any ETLs in between, they would show here-- and this isn't a lineage I'm really familiar with, but if there were, they would show. Remember? I don't know, Carla-- do you remember seeing a little bubble with a T on it? Those are the transformation rules, which you can hover over and actually look at. And then, obviously, these pictures can be downloaded for attestation, or for sitting down with the person who's using the information. So for this opex, if a data steward or a data engineer comes in, takes a look, and asks, how was that calculated? You can actually show the source, the lineage. And both business and technology people can be in the same room while you go through that information.
OK. This is automated? There's no manual API stuff that needs to happen to make this--
Yeah. So we have APIs-- a lot of the technologies are really based on API technology. But is it 100% automated? Obviously, everybody strives for 100%, but I think we have the highest automation of any system. Even so, there may be some code that's specific to, let's say, that technology or that environment. There may be some caveats.
Going from 60% to 70% automation on things that had no automation whatsoever is something we're still trying to solve. So what I would say is that I really don't know what percentage we can get to. It really depends on the technologies that you have. If you have something like Oracle, it's very established. We have very mature scanners for it-- PL/SQL-- so we'll be able to get a lot of information, because they've been around for a while.
But say you're doing something like Databricks, and you're using Unity Catalog. These are all new technologies that are changing, so obviously our scanners haven't had the maturity of being used in multiple places, and those aren't going to be as automated. What I would say is that we'll try to get them automated. Each site might have a slightly different setup, and professional services will help.
OK. So when you said that you're growing your connectors in-house, that's not open-source then? That's just through you?
That's the thinking. Yeah. Yeah.
And then they used Crystal Reports. I don't know if I mentioned that before. I don't know if there's a plan to move away from that, but is that something that you--
Crystal Reports-- let me see. So if you come here, Data Modeling Connectors, that would be under the ETL. So there-- and I can get back to you on it without even looking. Brad, if you wouldn't mind just putting a note down for Crystal Reports, I can do some research on it. But Carla, this would be a great place for you to look at where all of our [? GA ?] scanners are and what we do, like reverse engineering versus forward engineering.
This component would help you do the reverse-- like Databricks, from where it sits to how we get that information back. And there's also a test automation component; I don't know if they're into that kind of stuff. But this would be a great place. And again, I can't see who's logged on, but if you have a resource who is more technical, you guys can come and take a look at it.
And if you don't see it, if I don't see it, then I'll let you know whether it's on our roadmap or it's not something we're going to support because it's too old. But here's the thing. I've done data governance for a long time, and for every tool, if you can't connect to it, there's always an extract file you can get data from. But that is then something you have to orchestrate: make sure the files you're dropping have the right permissions and logins, and sit in a place where only certain people or certain processes can pick them up. And the cron jobs have to be orchestrated so that new information comes in based on some sort of trigger, whether it's time-based or change-based. So, yeah.
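As a rough sketch of the orchestration being described-- the paths are assumptions, and a polling loop stands in for an actual cron schedule; nothing here is vendor-specific:

    import os, shutil, stat, time

    DROP_DIR = "/data/drop"      # where the legacy tool deposits extract files
    INGEST_DIR = "/data/ingest"  # readable only by the catalog's loader

    def pick_up_extracts():
        for name in os.listdir(DROP_DIR):
            path = os.path.join(DROP_DIR, name)
            # Refuse world-readable files: the drop zone must stay locked down.
            if os.stat(path).st_mode & stat.S_IROTH:
                print(f"skipping {name}: permissions too open")
                continue
            shutil.move(path, os.path.join(INGEST_DIR, name))
            print(f"queued {name} for ingestion")

    # Time-based trigger standing in for cron; a change-based trigger
    # would watch for file-creation events instead.
    while True:
        pick_up_extracts()
        time.sleep(3600)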
Now, are there different personas when you log in? Or do you all just log in a single way and you see the same pages?
And that's one area I don't know that well, and I don't know how much of it I can view. But for each one of these personas, as you create them, you can associate an assignment, if you will. These are all configurable, so-- I don't know this part of the product as well, but you can have various levels of granularity as you create users, and give them different roles-- for example, a data steward user ID-- depending on how that is set up.
I know for a fact you can create as many different personas, if you will, as you'd like. All of these can be defined differently. For example, you can have people who are data stewards on one or two of these catalogs, or on multiple catalogs. And all of that is configurable. Have you ever read or seen a thing called DCAM?
Yeah.
OK. Within that assessment, one of the components is rights and users, and there's talk of a RACI component. So it supports all of that. And to be honest, we work with a company called Compass. What they're doing is taking DCAM and tightly coupling it with our solution. So for example, if you want to do component one-- let's say the data governance organization readiness component-- you can have all of that, and they will help you build DCAM-like workflows for it. But obviously, if you know DCAM, then you already know what needs to be in there, and you can actually build that yourself.
But we have a partner called Compass that will help you with that as well. What I mean by that is: we give you a solution, but if you're not organizationally ready to do data governance, then we can help you with that too. However, I think-- was that Maureen that we spoke to?
Monica--
Monica. There are two people. When I listen to them, they're not only data governance aware-- I think they're actually pretty entrenched. So as far as the facility to create that, RACI is here, and it's flexible. I just can't really show you how to do it, because I've never had to. But you can do it.
So this is permissions and rights per individual or role. And then is there a dashboard where we would-- is it just the home page where we see how many assets are connected, or is there another page where we would see percentage complete or something like that?
And I apologize.
That's OK. I'm not even-- I mean, you know it better than we do.
But what I didn't want-- which is what is happening right now-- is me fumbling through this information. I definitely wanted somebody who knows the system well to be able to walk you through it.
OK, so the job scheduler. It's OK if you're just clicking through here. That way, I can see, because I haven't seen the whole thing. So OK, so this is your data governance dashboard?
It's a metadata manager, so I wouldn't call this data governance-- and I'm sorry for putting it like that. This would be more about all the sources, where they come from, and what artifacts are in this data catalog: distribution, data lineage, which one has the most lineage. Data quality profiling is also here, in the sense that if you click through it, it will have all the data you pulled in as part of the data profiling.
The examples that you saw in the demos and whatnot-- we can see a holistic view of the data quality, if you will, of what's been profiled. And then sensitivity-- these are all the ones I know. I'm sure there's some sort of panel that you can create. So when you say a data dashboard, you can do something like this out of the box, but you can also probably build the dashboard--
Custom.
--a custom thing. I would imagine you could-- it does seem like it has some things you can actually change.
Yeah, I bet the tiles-- you probably can replace these tiles and move them around.
Mm-hmm. And being able to collect table names, which ones are sensitive-- they have sensitivity marks on them. These would be very important, I think. It could be sensitive, or it could be other tags that you want to put on there. So that would [INAUDIBLE] if you need to customize [INAUDIBLE].
And do you provide code to do the automated tagging or classification? Or is that something that you've got to--
No, no.
--use regex or--
So a person who has the access and permission to tag can do that-- a data steward, say, would be able to do that. Now, if you're asking about looking at a data source and, as you ingest it, identifying whether it's PII information or not-- is that what you're asking?
Yeah.
OK. So that would be something that I think, within our DQ, we can do. I've seen some demos of it as recently as last week, so that's something we can show you. There's some AI component to it as well. I only saw it at a high level, because I was in between meetings going back and forth, but I did see some of it. If you're interested, we can probably provide some information on that as well.
OK. Yeah, I'm curious whether it's pattern matching-- whether you're doing name recognition or actually looking at the content for patterns, like a Social Security number, something like that.
Actually, within our solution, we do that. And I believe Nadeem demoed it toward the end of the demo, because there was a question on it. So what that is is metadata being connected-- catalogs being connected to the glossary, for example. Take a business term-- let's say we use the term customer. There's an AI component that connects where that term actually shows up in a lineage diagram. I think he basically clicked through it: here's the likelihood of customer being in these tables-- go take a look at them.
And you can do the same thing for Social Security. It will still require somebody to actually connect those two things together, but that's better than somebody sitting there looking at 300 assets and trying to figure out which one is which. So the AI really removes a lot of that heavy lifting for you.
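For a sense of what content-based detection looks like at its simplest, here is a sketch using regex patterns on sampled values. The patterns and threshold are assumptions for illustration; a real classifier would combine column-name recognition, content patterns, and ML-based scoring, and a steward would still confirm each suggestion.

    import re

    # A couple of assumed PII content patterns.
    PATTERNS = {
        "US_SSN": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
        "EMAIL":  re.compile(r"^[\w.+-]+@[\w-]+\.[\w.-]+$"),
    }

    def suggest_tags(sample_values):
        # Suggest a tag when most sampled values match a pattern;
        # the tag is only a suggestion until a steward accepts it.
        suggestions = []
        for tag, pattern in PATTERNS.items():
            hits = sum(bool(pattern.match(v)) for v in sample_values)
            if sample_values and hits / len(sample_values) >= 0.8:
                suggestions.append(tag)
        return suggestions

    print(suggest_tags(["123-45-6789", "987-65-4321", "078-05-1120"]))  # ['US_SSN']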
OK. Saikat, did you have anything that you wanted to see in particular or anything that we ought to take a look at while we've got him here?
Yeah. He's covered most of the things, so I don't have anything specific to ask right now.
Saikat, were you on the first demo as well?
I was there, yeah. The one that Nadeem gave, right?
Yeah. Yeah, yeah.
I was there. Yeah.
Was there anything there that you had questions about?
No. I like the mind mapping features-- the one where he showed the business lineage. Most of the questions I had were covered in that session as well. I think he also touched on the technical lineage. So this mind mapping is mostly high-level, on business lineage, not the technical one. Am I right?
Yeah. So for example, with this production environment, you can go and actually take a look at its lineage. And this is the thing that I think he was showing you, right? So being able to look at all of these different things, do stuff like that, and see things like this-- if I take a look at this, we have scanners for Snowflake, SSIS, obviously, and this is probably Oracle Database and BI.
So we have out-of-the-box scanners for all of this. Now, if you go and talk to different vendors and whatnot, most of them are going to have problems with ready-to-go scanners, the [? GA ?] scanners. A lot of them don't develop them themselves. So the lineage is one of the things that everybody really wants to see-- and here's the ETL that I was talking about. You can go and look at it. We don't give you the entire code per se, but we understand it well enough that, within this diagram, we'll give you an excerpt, if you will.
We won't give you a lot of the selects, but the actual business rule-- as in what alteration the data has gone through-- would be shown here. So, Carlyle being a financial institution-- I've been in banking for about 35 years, and one of the things I used this for was regulatory and other data exams, where I could sit down and show this from beginning to end: how the data was consumed and, for example, how some of these reports were calculated.
You should be able to see the business rule around that. So you can see that principal balance, at least from a lexical definition, from a data governance perspective, is, say, XYZ divided by 4, all square-rooted by 2. You can see that. So see how powerful that is: you can sit down with somebody who's consuming the information, and if they have questions about anything-- address, for example-- not only can you see the information, you can also see the transformation that was done.
I know this is address, but just think about if this was something else. And the DQ score is hugely important too, because from a consumption perspective, this address being at 50% from a data quality score perspective is an issue. For example, if you're doing a percentage or some calculation, and the data you're looking at is only 50% complete, that is a problem.
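For what a "50%" score like that typically means, here is one common formulation-- a completeness score, the share of usable values in a column. This is a generic sketch, not the product's actual scoring; the sentinel values treated as blank are assumptions.

    def completeness(values):
        # Share of non-null, non-blank values: one common DQ dimension.
        filled = [v for v in values if v not in (None, "", "N/A")]
        return len(filled) / len(values) if values else 0.0

    addresses = ["12 Main St", None, "45 Oak Ave", "", "9 Elm Rd", None, "101 Pine Ct", "N/A"]
    print(f"address completeness: {completeness(addresses):.0%}")
    # address completeness: 50% -- low enough to flag any downstream calculation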
And also, think about where in the data chain you're doing the data quality work. Are you doing it at the consumption point? If you're not really looking at these files and you're not profiling them here, for example, where are you doing it? Does it make sense to do it at the source and again after transformation, so you can see before and after?
That's really-- that's a very cool feature right there. I have not seen-- I guess you could have different levels of profiling at different stages during the lifecycle. Wow. That's awesome.

Yeah. But think about that. It gives the organization the ability to see the whole thing, because data profiling takes a really long time. It eats up a whole bunch of compute cycles and time, so you want to make sure you're doing it in the right places. And obviously, organizations are trying to do a lot of that at the source-- make sure you do the data quality at the source.
Yeah, don't [? fast ?] track it all through the system. I did have a question about auto-discovery. Once you've connected to a data source and are regularly running the extraction process, is there a notification when new columns or tables show up?
Yeah. So I know we do that. How it does it, I can't actually show you. For example, when I was at another financial institution, the way they did it is the way we recommend: they had a whole change management process tied to their data modeling system. So they would have a production environment and a data modeling environment, and the DDL and DML for that environment would change.
Then that creates another version in the data modeler, and that triggers, let's say, new data being added to this ecosystem. And then you can do that kind of discovery. How it does the automation to look at what's changed with the [? I ?] by itself, I don't know how to show. But I know that we do it; I just don't know how to show it.
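The underlying idea-- however the product implements it-- can be sketched as diffing two schema snapshots and raising a change event for each difference. The snapshot shape and messages below are illustrative assumptions only.

    def diff_schemas(old, new):
        # Each snapshot maps table name -> set of column names.
        changes = []
        for table in sorted(old.keys() | new.keys()):
            added = new.get(table, set()) - old.get(table, set())
            dropped = old.get(table, set()) - new.get(table, set())
            changes += [f"{table}: column added {c}" for c in sorted(added)]
            changes += [f"{table}: column dropped {c}" for c in sorted(dropped)]
        return changes

    yesterday = {"customer": {"id", "name"}}
    today = {"customer": {"id", "name", "email"}, "orders": {"id", "total"}}
    for change in diff_schemas(yesterday, today):
        print(change)  # each change could notify the table's assigned steward
    # customer: column added email
    # orders: column added id
    # orders: column added total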
OK. That's something we would be interested in seeing. Is that something that comes to the data steward, or data governance? How does the awareness work-- how are people notified? And then do we have a way-- say, for example, if additional columns were added to an existing table that already had a data steward-- could we automatically assign those objects the same data steward?
Yeah, that's a great question. And the way I would write that question down is: how does the data intelligence auto-detect changes in the data source? And it's structural, I would imagine. So let's say a column gets added, dropped, or altered-- how does it detect that?
Yeah. Let's say somebody changes a definition, and I'm an owner on that. Will I get notified that, hey, this definition was changed? And can I approve it or revert back to the original? Just those kinds of things.
Yeah. That I know we do. Nadeem actually does show that-- whether he did it in your session or not, I'm not sure. If your name is associated with a column, like a business term, then for any change that happens, you can set up a workflow so that it's not automatic-- it goes through a two-step process: somebody makes a change, that gets collaborated on, and then the change is either accepted or rejected, for example. And that is done within the workflow. Everybody gets notified, and they get a message in their My Action Center saying, please review this change. That I know we do. How we detect the changes--
[INAUDIBLE]. So are these some out-of-the-box workflows? Are these already present, in-built, or is it customized?
Yeah. So they're out of the box. There are a few that, if you look at the demo, I think he did show-- if he didn't, I apologize, but it's toward the end of the demo. The out-of-the-box ones are like templates, and then you can update and rename them, or you can create a completely new one yourself. But it comes with the templates that most data governance organizations would use, and most of them go through two or three steps of collaborative work before anything actually gets inked.
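A minimal sketch of such a two-step review workflow, modeled as a small state machine-- the states, transitions, and class here are assumptions patterned on a generic propose/review template, not the product's actual workflow engine:

    # Allowed transitions for a simple two-step review template.
    TRANSITIONS = {
        "draft":     {"submitted"},
        "submitted": {"approved", "rejected"},  # reviewer's decision
        "rejected":  {"draft"},                 # author revises and resubmits
    }

    class ChangeRequest:
        def __init__(self, description):
            self.description = description
            self.state = "draft"

        def move_to(self, new_state, actor):
            if new_state not in TRANSITIONS.get(self.state, set()):
                raise ValueError(f"cannot go from {self.state} to {new_state}")
            self.state = new_state
            # A real system would also notify watchers' Action Centers here.
            print(f"{actor}: '{self.description}' is now {self.state}")

    cr = ChangeRequest("update definition of 'customer'")
    cr.move_to("submitted", "data steward")
    cr.move_to("approved", "data owner")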
OK. I don't know if I have any more questions. I think, really, the thing I've wanted to get clear for them is the concept that, yes, we can have multiple dictionaries and data sources separated by entity, but there's a single person-- or group or something-- that will need to maintain this and will have access to everything?
The admin, which is the login that I'm using, is the super user, which goes across everything. And then the RACI is what creates the containers, so that the data steward for finance, the data steward for HR, and the data steward for something else each have access to either their own container of glossaries or multiple ones. HR might have information across the whole organization, but what HR can see is also leveled-- at the system, database, schema, and column level. So they won't have access to the information in finance, but they would have information like hire date that goes across the organization.
So there is a RACI chart you can create to set up those user permissions and what they have access to. And the overall system can be monitored by the admin, though that could also be taken away so the admin isn't making any changes. One big thing I forgot to mention, though, is that everything you do in this system is audited-- any changes that you make. It will show in the system audit: this changed on this date, by this person.
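As a sketch of that container-scoped visibility-- the role names and asset patterns below are invented for illustration and are not the product's actual RACI model:

    from fnmatch import fnmatch

    # Assumed roles mapped to asset-name patterns they may see.
    SCOPES = {
        "steward_finance": {"finance.*"},
        "steward_hr":      {"hr.*", "*.hire_date"},  # HR attribute spans units
        "admin":           {"*"},                    # super user sees everything
    }

    def can_see(role, asset):
        return any(fnmatch(asset, pattern) for pattern in SCOPES.get(role, set()))

    print(can_see("steward_hr", "finance.opex"))       # False
    print(can_see("steward_hr", "finance.hire_date"))  # True
    print(can_see("admin", "finance.opex"))            # True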
Do you have a concept of domains? If I click on the down arrow for all, can I search by a business function?
Yeah, so let's define the domain. I think there are concepts introduced with data mesh, data fabrics, and things like that, where a domain is a collection of data sets under a single subject area, if you will. So for example, market data might be a domain. And yeah, that's absolutely possible.
OK. Yeah, it really helps organize the search results when you can limit it to a particular area, instead of [INAUDIBLE].
Yeah. And I think, if a user's permissions are such that a finance person can only look at finance business terms, glossaries, and metadata, then they won't be able to see other stuff anyway. But I know what you mean. And that's an interesting concept that may introduce some difficulties for organizations like Carlyle, because you might have different business units, but do you have one finance organization? Do you have one risk organization? Well, that's a difficult one. It's not a technical problem; it's more of the business [INAUDIBLE]. That makes sense.
OK. I'm trying to think if there was anything else. I'm assuming the pages are all customizable-- you can add custom attributes to the landing page or the catalog page for an asset?
Yeah. Yeah, you can actually customize it with the Carlyle logo, or whatever the organization's logo is, if you will. You can do a lot of customization.
Do you have sampling? I didn't know if they-- if you go into the columns, is there sampling available?
Oh, data sampling. We were supposed to find that out for you. Let's add that as another item.
OK. Ask about profiling, too, if that's a--
Yeah. The profiling, I know we do. The definition of sampling, though, as you've correctly pointed out, was more to do with--
Just random-- where it would grab, say, every 100th record and show the content.
Yeah.
Can you click on one of the columns so I can see what the page looks like?
Yeah, a lot of these don't have-- for example, it's a demo environment, so it doesn't have all of the information really filled out, other than just the bare minimum.
If you add additional attributes, does it go under the column on the left or does it create tabs and things on the right?
I think, if you do something like that, you could have a window where you can update, and then you can save. I'm not going to do that, because I don't want to mess it up.
All right, Saikat, did you have anything else?
Nope, I don't think so.
So can we just review the questions so that we know exactly what we're getting back?
Yeah. Do you want to stop the recording, though?
Yeah.
OK.
And I would say, both of you guys, look at the last 15 minutes of the previous demo. I know he went through it fairly quickly, but I know for a fact it covered some of these things-- because with a recording, what you can do is obviously stop and say, what the heck did he just do? So you can review it. And Brad, can you read off the questions and make sure we got all of them? And then we need to follow up with some technology people.
Yeah. The ones I noted: what's the status of Crystal Reports-- are we using that, and are we going to continue with Crystal Reports for reporting?
Yep.
How do we show stewardship? And I know we went through a couple of those. Also, I noted something on changes in workflow-- how are admins notified of any changes that happen?
Yeah, I think we got that.
Through the source of the data. Yeah.
We got that answered, didn't we? If you're assigned to it or you follow it--
Yeah.
OK. Sampling was on there?
Yeah, random sampling-- how are we capturing that as well? Those are the ones that I took down.
So there was the sampling, and what was the first-- there's only two then, no?
Yeah, I think that's all I noted here, Crystal Reports and sampling.
OK. I'm going to get back to you, Carla, as soon as I can. My schedule is a little crazy today, so-- yeah.
There's no emergency. They're all going to be out for the holiday anyway. But I think this was a really good session, and if I can share this recording with them-- at least with the data governance folks-- that'll help them.