#bytescrolls: Creating index in Hive

Simple:

CREATE INDEX idx ON TABLE tbl(col_name) AS 'Index_Handler_QClass_Name' IN TABLE tbl_idx;

To keep indexing algorithms pluggable, you must name the class that handles indexing, for example org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler.
Index handler classes implement the HiveIndexHandler interface.
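
As a rough illustration of the simple form, here is what the statement might look like with the compact handler plugged in. The table `page_views`, its column `url`, and the index and index-table names are made up for this sketch, and it assumes a Hive release that still ships index support (indexing was removed in Hive 3.0).

```sql
-- Hypothetical base table: page_views(url STRING, ...)
-- idx_page_url and page_views_url_idx are illustrative names only.
CREATE INDEX idx_page_url
ON TABLE page_views (url)
AS 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
WITH DEFERRED REBUILD
IN TABLE page_views_url_idx;
```

WITH DEFERRED REBUILD is included because many Hive releases expect it for the compact handler; drop it if your version builds the index at creation time.
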
Full Syntax:

CREATE INDEX index_name
ON TABLE base_table_name (col_name, ...)
AS 'index.handler.class.name'
[WITH DEFERRED REBUILD]
[IDXPROPERTIES (property_name=property_value, ...)]
[IN TABLE index_table_name]
[PARTITIONED BY (col_name, ...)]
[
[ ROW FORMAT ...] STORED AS ...
| STORED BY ...
]
[LOCATION hdfs_path]
[TBLPROPERTIES (...)]
[COMMENT "index comment"]

WITH DEFERRED REBUILD - the newly created index is initially empty. ALTER INDEX ... REBUILD can then be used to bring the index up to date.
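
A minimal sketch of the deferred flow, assuming a hypothetical table `page_views` with a `url` column (the index name is also invented): the index is created empty and then filled by an explicit rebuild.

```sql
-- Step 1: create the index; no index data is written yet.
CREATE INDEX idx_url_deferred
ON TABLE page_views (url)
AS 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
WITH DEFERRED REBUILD;

-- Step 2: build (or later refresh) the index contents.
ALTER INDEX idx_url_deferred ON page_views REBUILD;
```

The rebuild has to be repeated after the base table changes, since the index is not maintained automatically.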

IDXPROPERTIES/TBLPROPERTIES - key/value properties, declared for the index and for the index table respectively

PARTITIONED BY - the table columns by which the index is partitioned; if not specified, the index spans all table partitions

ROW FORMAT - a custom SerDe or the native SerDe (the Serializer/Deserializer Hive uses to read and write rows). The native SerDe is used if ROW FORMAT is not specified

STORED AS - the storage format of the index table, such as RCFILE or SEQUENCEFILE.

IN TABLE - the name of the index table (tbl_idx in the simple example). If the user specifies it, the name must be unique across tables; otherwise the index table is named automatically.

STORED BY - a storage handler; this can be HBase (I haven't tried it)
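
To show how several of these optional clauses combine, here is a sketch that names the index table explicitly and stores it as RCFILE. The table, column, index, and index-table names are all hypothetical.

```sql
-- Clauses appear in the order given by the full syntax.
CREATE INDEX idx_url_rc
ON TABLE page_views (url)
AS 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
WITH DEFERRED REBUILD
IN TABLE page_views_url_rc_idx
STORED AS RCFILE
COMMENT 'compact index on url, stored as RCFILE';
```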

The index can be stored in a Hive table, or elsewhere, for example as an RCFILE at an HDFS path. When no Hive index table is used, the index handler's usesIndexTable() method returns false. When the index is built, generateIndexBuildTaskList(...) in the index handler class generates a plan for building it.

Consider CompactIndexHandler from the Hive distribution:

For each indexed value, it stores only the addresses of the HDFS blocks containing that value. The index table's columns are declared as metastore FieldSchema objects, with _bucketname and _offsets added alongside the indexed columns.

That is, the index table contains three kinds of columns: the indexed columns (taken from the base table's field schema), _bucketname (the HDFS file of the table partition that holds the matching rows), and _offsets (the block offsets within that file).
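
One way to see this layout is to query the index table directly. The sketch below assumes a compact index on a hypothetical `url` column whose index table is named `page_views_url_idx`, and that the index has already been rebuilt. The underscore-prefixed columns need backticks.

```sql
-- Each row maps one indexed value to a file and the block offsets inside it.
SELECT url, `_bucketname`, `_offsets`
FROM page_views_url_idx
WHERE url = 'http://example.com/';
```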

See the code from CompactIndexHandler,
